<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Every (hello@wentsch.me)</title>
    <link>https://every.to/feeds/1d9e62247f697a00709f</link>
    <description>Recent posts</description>
    <language>en-us</language>
    <ttl>40</ttl>
    <item>
      <title>How Anthropic Makes Claude More Reliable</title>
      <description>&lt;table&gt;&lt;tr&gt;&lt;td&gt;&lt;img alt="Context Window" src="https://d24ovhgu8s7341.cloudfront.net/uploads/publication/logo/94/small_context_windown_1.png" /&gt;&lt;/td&gt;&lt;td&gt;&lt;/td&gt;&lt;td&gt;&lt;table&gt;&lt;tr&gt;&lt;td&gt;by &lt;a href="https://every.to/@laura_27bbaf_1" itemprop="name"&gt;Laura Entis&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;in &lt;a href="https://every.to/context-window"&gt;Context Window&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;figure&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/post/cover/4307/full_page_cover_fc509aeb8c5cdfd5-cover-image-cw.png"&gt;&lt;figcaption&gt;Midjourney/Every illustration.&lt;/figcaption&gt;&lt;/figure&gt;&lt;p&gt;Living at the edge of AI is bittersweet. You can spend weeks building a workaround to a problem only for a frontier lab to swoop in and solve it for you in a more elegant, reliable way. Today, senior applied AI engineer &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@nityesh" rel="noopener noreferrer" target="_blank"&gt;Nityesh Agarwal&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; explains how Anthropic’s dynamic workflows feature made his elaborate Claude setup look clumsy in retrospect, the Every team shares which corners of the AI frontier they’ve given themselves permission to ignore, and executive operations manager &lt;strong&gt;&lt;u&gt;&lt;a href="https://www.linkedin.com/in/jalaiyah-bolden/" rel="noopener noreferrer" target="_blank"&gt;Jalaiyah Bolden&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; walks through her step-by-step process for turning a Slack bot into a reliable coworker.&lt;/p&gt;&lt;p&gt;&lt;em&gt;Every is off tomorrow for Juneteenth; we’ll be back Sunday. Was this newsletter forwarded to you? 
&lt;u&gt;&lt;a href="https://every.to/account" rel="noopener noreferrer" target="_blank"&gt;Sign up&lt;/a&gt;&lt;/u&gt; to get it in your inbox.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;Mini-Vibe Check: Dynamic Workflows&lt;/h2&gt;&lt;h4&gt;A closer look at how Claude Code coordinates multiple agents &lt;/h4&gt;&lt;p&gt;When senior applied AI engineer Nityesh Agarwal &lt;u&gt;&lt;a href="https://every.to/p/what-i-learned-onboarding-our-ai-project-manager" rel="noopener noreferrer" target="_blank"&gt;built Every’s AI project manager&lt;/a&gt;&lt;/u&gt; Claudie, he spent days figuring out how to get around the model’s limited context window, or the cap on how much text an LLM can process at once—and the reason Claudie kept dropping key details. His solution: one coordinating agent that delegated tasks to fleets of &lt;u&gt;&lt;a href="https://every.to/source-code/claude-code-camp" rel="noopener noreferrer" target="_blank"&gt;subagents&lt;/a&gt;&lt;/u&gt;, which gathered data, made updates, and communicated with one another via local markdown files. The process was “a little bit hacky,” Nityesh says, but it worked. &lt;/p&gt;&lt;p&gt;If he were to build Claudie today, he could just use &lt;u&gt;&lt;a href="https://code.claude.com/docs/en/workflows" rel="noopener noreferrer" target="_blank"&gt;dynamic workflows&lt;/a&gt;&lt;/u&gt;, Anthropic’s feature for orchestrating large, multi-agent Claude Code tasks. Instead of deciding each step on the fly, Claude writes a reusable script that coordinates the work. It can assign tasks to many subagents and have them check each other’s work before reporting back the results.&lt;/p&gt;&lt;p&gt;Before dynamic workflows, trying to get Claude to reliably spawn reviewer agents was a persistent headache. Anxious about token spend, the model “would sometimes try to merge it all into one subagent,” Nityesh says, dragging down the quality of the results. 
Increasingly dramatic directives &lt;em&gt;not&lt;/em&gt; to do this often went unheeded. Now, if you tell Claude you want three verifier subagents with dynamic workflows, Claude will write a script that generates three subagents every time. &lt;/p&gt;&lt;p&gt;Nityesh is grateful for the new feature, but watching weeks of work get negated by a single release was also disheartening. “I spent so many weeks building that other thing. Now it’s useless,” he says. &lt;/p&gt;&lt;p&gt;“But that’s the cost of being at the frontier,” he continues. “You need to be ahead of everybody else, and sometimes that means you need to throw away your past work.”&lt;/p&gt;&lt;div class="quill-block-image" id="quill-block-image-1781799757790-3hjr6v9o1" data-source="{&amp;quot;dom_id&amp;quot;:&amp;quot;quill-block-image-1781799757790-3hjr6v9o1&amp;quot;,&amp;quot;link&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4307/optimized_df424727-8087-4c7a-a0b4-35dc3674f6fa.png&amp;quot;,&amp;quot;image&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4307/optimized_df424727-8087-4c7a-a0b4-35dc3674f6fa.png&amp;quot;,&amp;quot;caption&amp;quot;:&amp;quot;(Image courtesy of Anthropic.)&amp;quot;,&amp;quot;error&amp;quot;:null}"&gt;&lt;div&gt;&lt;a href="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4307/optimized_df424727-8087-4c7a-a0b4-35dc3674f6fa.png" target="_blank" rel="noopener noreferrer"&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4307/optimized_df424727-8087-4c7a-a0b4-35dc3674f6fa.png" alt="(Image courtesy of Anthropic.)"&gt;&lt;/a&gt;&lt;figcaption class="quill-image-caption"&gt;(Image courtesy of Anthropic.)&lt;/figcaption&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;&lt;br&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;A dynamic workflows case study.&lt;/strong&gt; For &lt;u&gt;&lt;a href="https://writewithspiral.com/?utm_source=everywebsite" rel="noopener noreferrer" target="_blank"&gt;Spiral’s 
redesign&lt;/a&gt;&lt;/u&gt;, senior designer &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@daniel_5fbd21_1" rel="noopener noreferrer" target="_blank"&gt;Daniel Rodrigues&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; sent the writing app’s general manager &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@marcus_fd8302_1" rel="noopener noreferrer" target="_blank"&gt;Marcus Moretti&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; a giant Figma file.&lt;/p&gt;&lt;p&gt;Marcus needed to convert the file into code. He did a pass in &lt;u&gt;&lt;a href="https://every.to/source-code/claude-code-for-product-managers" rel="noopener noreferrer" target="_blank"&gt;Claude Code&lt;/a&gt;&lt;/u&gt;, but the result had numerous errors. Before dynamic workflows, he would have flagged the mistakes in batches for Claude Code to fix—a repetitive, frustrating process.&lt;/p&gt;&lt;p&gt;Instead, Marcus asked Claude Code to set up a dynamic workflow that would review the Figma file section by section, extract all assets and design details, turn them into code, and check the results against the original file.&lt;/p&gt;&lt;p&gt;The Figma file had 11 sections, so Claude spun up 11 tasks, each with dedicated subagents. After running for a couple of hours, “it was not perfect,” Marcus says, but “it saved me a whole bunch of time.” Before dynamic workflows, each of the reviewer subagents would have been Marcus himself.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Try it yourself:&lt;/strong&gt; For complex projects like a code migration, changing the programming language a product uses, or a major upgrade, dynamic workflows might be a good solution, Marcus says. To initiate the feature, you can simply type “workflow” in a Claude Code session, or include “ultracode” in the prompt. 
&lt;/p&gt;&lt;p&gt;Or test out &lt;u&gt;&lt;a href="https://every.to/p/claude-fable-5-prompt-library?source=post_button#prompt-section-dynamic-workflow" rel="noopener noreferrer" target="_blank"&gt;Nityesh’s prompt&lt;/a&gt;&lt;/u&gt; for kicking off a dynamic workflow. &lt;/p&gt;&lt;h3&gt;&lt;hr class="quill-line"&gt;&lt;/h3&gt;&lt;h2&gt;Permission to skip&lt;/h2&gt;&lt;h4&gt;&lt;strong&gt;Rapid-fire roundup edition &lt;/strong&gt;&lt;/h4&gt;&lt;p&gt;The pace of AI is unrelenting. Each week brings new model releases, benchmark results, and “paradigm shifts” that sometimes turn out to be incremental upgrades. &lt;/p&gt;&lt;p&gt;At Every, we do our very best to stay at the frontier—but for better and worse, we are human, which means we cannot run all night. Here, Every staffers share what they’ve given themselves permission to skip in order to, you know, sleep, &lt;u&gt;&lt;a href="https://knowyourmeme.com/memes/touch-grass" rel="noopener noreferrer" target="_blank"&gt;touch grass&lt;/a&gt;&lt;/u&gt;, or run other AI experiments. [Disclaimer: All of this is, of course, subject to change!] &lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@mike_2114" rel="noopener noreferrer" target="_blank"&gt;Mike Taylor&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;, head of tech consulting: “All model releases that didn’t come from Anthropic or OpenAI.” Exhausted from putting Fable 5 &lt;u&gt;&lt;a href="https://every.to/vibe-check/anthropic-mythos-our-fable-vibe-check" rel="noopener noreferrer" target="_blank"&gt;through its paces&lt;/a&gt;&lt;/u&gt;, Mike recently declined the chance to beta-test a model from another big tech company. “Maybe I’m stupid for saying this, but I don’t expect it to be as good as what we’re currently testing,” he says. “I might just not be able to make the time. 
Whereas if it’s something I think is really, really good, then I’ll miss my kid’s birthday party to test it.”&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@kieran_1355" rel="noopener noreferrer" target="_blank"&gt;Kieran Klaassen&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;, general manager of &lt;strong&gt;&lt;u&gt;&lt;a href="https://cora.computer/?utm_source=everywebsite" rel="noopener noreferrer" target="_blank"&gt;Cora&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;: “Open-source models. There are so many, and they’re not as good as the best models, so I skip all of them.”&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@williewilliams" rel="noopener noreferrer" target="_blank"&gt;Willie Williams&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;, head of platform: New, complex development workflows—&lt;u&gt;&lt;a href="https://every.to/guides/compound-engineering" rel="noopener noreferrer" target="_blank"&gt;compound engineering&lt;/a&gt;&lt;/u&gt; has cured his FOMO when he encounters them on X. “I just use the /lfg workflow from compound engineering,” he says. “Now I’m like, ‘It’s all I need.’”&lt;/p&gt;&lt;p&gt;Daniel Rodrigues&lt;strong&gt;,&lt;/strong&gt; senior designer: Agent-ready &lt;u&gt;&lt;a href="https://x.com/emmettshine/status/2054539694097015171?s=20" rel="noopener noreferrer" target="_blank"&gt;design systems&lt;/a&gt;&lt;/u&gt;, in which you translate brand assets and design decisions into code an agent can use. He sees the value in this way of working, particularly on the client side, but building these complex systems is not how he’d choose to spend his time if given the choice. Why? “It doesn’t look fun, and design should be fun,” he says. Amen. 
&lt;/p&gt;&lt;h3&gt;&lt;hr class="quill-line"&gt;&lt;/h3&gt;&lt;h2&gt;Steal this workflow&lt;/h2&gt;&lt;h3&gt;&lt;strong&gt;Treat your Slack bot like a coworker &lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;Jalaiyah Bolden is Every’s executive operations manager: She manages CEO &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@danshipper" rel="noopener noreferrer" target="_blank"&gt;Dan Shipper&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;’s calendar, internal events, operations projects, and customer support. Her small but mighty teams use a number of &lt;u&gt;&lt;a href="https://every.to/on-every/introducing-plus-one-one-click-openclaw-agents-by-every" rel="noopener noreferrer" target="_blank"&gt;Slack-based agents&lt;/a&gt;&lt;/u&gt;—most notably &lt;u&gt;&lt;a href="https://viktor.com/hire-an-ai-employee?gad_source=1&amp;amp;gad_campaignid=23610878065&amp;amp;gbraid=0AAAABC9uvB8FNNXa68YTUblEdXWCBkUNY&amp;amp;gclid=Cj0KCQjwrs7RBhDuARIsAIVfBD2MJaAgkraMLb-C3So4xfSrI52cuzNBBVFsbJ3A-hAlgMHeBHuxopAaAsbZEALw_wcB" rel="noopener noreferrer" target="_blank"&gt;Viktor&lt;/a&gt;&lt;/u&gt;—to automate or assist with the myriad tasks regularly thrown at them. &lt;/p&gt;&lt;p&gt;Here’s Jalaiyah’s step-by-step process for getting an agent to take over routine tasks such as auditing ticket closures, creating discount codes, or compiling payment dispute evidence. &lt;/p&gt;&lt;ol&gt;&lt;li&gt;&lt;strong&gt;Pick one recurring job&lt;/strong&gt;. Start with something low-stakes but annoying. Anything that comes across your desk more than once could qualify: Maybe it’s a templated but extensive weekly report, or a customer support audit, or a workflow to create coupon codes. Whatever it is, make sure the agent has access to the systems it needs to find or verify information, and then describe the output you want. 
For example: “Audit Fin [formerly Intercom, a customer support platform] closures from the last 12 hours and flag anything suspicious.”&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Start a conversation&lt;/strong&gt;. Treat the bot like a new hire. Jalaiyah’s go-to prompt: “What other information do you need to be able to make this repeatable and consistent over time?” Let the agent ask clarifying questions, then give it rules, examples, edge cases, and a clear definition of what “good” and “bad” look like.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Review and revise what it comes back with&lt;/strong&gt;. Let the agent gather information and draft a response. Then review the result yourself, explain what needs to be handled differently, and fill in any information gaps. The result should improve the next time around. &lt;/li&gt;&lt;/ol&gt;&lt;p&gt;&lt;strong&gt;Try it this week:&lt;/strong&gt; Choose one recurring task and ask your agent: “What other information do you need from me to run this reliably every week?”&lt;/p&gt;&lt;h3&gt;&lt;hr class="quill-line"&gt;&lt;/h3&gt;&lt;h2&gt;One last thing&lt;/h2&gt;&lt;h4&gt;&lt;strong&gt;Mistral who? &lt;/strong&gt;&lt;/h4&gt;&lt;p&gt;When it was announced that the heads of the top AI companies would convene &lt;u&gt;&lt;a href="https://www.businessinsider.com/anthropic-dario-amodei-openai-sam-altman-demis-hassabis-g7-lunch-2026-6" rel="noopener noreferrer" target="_blank"&gt;in person at the G-7 Summit&lt;/a&gt;&lt;/u&gt; to discuss the “opportunities and dangers” posed by the technology, the lineup was who you’d expect: OpenAI CEO &lt;strong&gt;Sam Altman&lt;/strong&gt;, Anthropic CEO &lt;strong&gt;Dario Amodei&lt;/strong&gt;, Google DeepMind CEO &lt;strong&gt;Demis Hassabis&lt;/strong&gt;, Meta chief AI officer &lt;strong&gt;Alexandr Wang&lt;/strong&gt;, and &lt;strong&gt;Arthur Mensch&lt;/strong&gt;, the CEO of &lt;u&gt;&lt;a href="https://chat.mistral.ai/chat" rel="noopener noreferrer" target="_blank"&gt;Mistral&lt;/a&gt;&lt;/u&gt;. 
&lt;/p&gt;&lt;p&gt;If you’re wondering how Mistral made the cut, &lt;u&gt;&lt;a href="https://x.com/amasad/status/2066700847187140655?ref_src=twsrc%5Etfw%7Ctwcamp%5Etweetembed%7Ctwterm%5E2066723871198208351%7Ctwgr%5E7aded428e33cda8ee66bbd483b399fd9772452cf%7Ctwcon%5Es2_&amp;amp;ref_url=https%3A%2F%2Fwww.businessinsider.com%2Fwhat-is-le-chaton-fat-mistral-meme-explained-ai-model-2026-6" rel="noopener noreferrer" target="_blank"&gt;you’re not alone&lt;/a&gt;&lt;/u&gt;. A French maker of open-source models, Mistral is “among the most vaunted artificial intelligence firms in Europe,” per the&lt;em&gt; &lt;u&gt;&lt;a href="https://www.nytimes.com/2026/06/17/world/europe/g7-summit-ai-tech-leaders-openai-anthropic.html" rel="noopener noreferrer" target="_blank"&gt;New York Times&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;. But on X, the company is perhaps more famous for “Le Chaton Fat,” a &lt;u&gt;&lt;a href="https://x.com/AlexanderKnigge/status/2066267845546442762" rel="noopener noreferrer" target="_blank"&gt;spoof all-powerful model&lt;/a&gt;&lt;/u&gt; that &lt;u&gt;&lt;a href="https://www.businessinsider.com/what-is-le-chaton-fat-mistral-meme-explained-ai-model-2026-6" rel="noopener noreferrer" target="_blank"&gt;went viral online&lt;/a&gt;&lt;/u&gt; after Mistral announced it was renaming its chatbot from Le Chat to Vibe, leading commentators to post about other potential cat-themed successors to Le Chat. 
&lt;/p&gt;&lt;div class="quill-block-image" id="quill-block-image-1781799757808-kg4bag1vo" data-source="{&amp;quot;dom_id&amp;quot;:&amp;quot;quill-block-image-1781799757808-kg4bag1vo&amp;quot;,&amp;quot;link&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4307/optimized_c090a2fc-1093-4d81-b116-aa8429090f69.png&amp;quot;,&amp;quot;image&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4307/optimized_c090a2fc-1093-4d81-b116-aa8429090f69.png&amp;quot;,&amp;quot;caption&amp;quot;:&amp;quot;(Screenshot courtesy of X/Laura Entis.)&amp;quot;,&amp;quot;error&amp;quot;:null}"&gt;&lt;div&gt;&lt;a href="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4307/optimized_c090a2fc-1093-4d81-b116-aa8429090f69.png" target="_blank" rel="noopener noreferrer"&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4307/optimized_c090a2fc-1093-4d81-b116-aa8429090f69.png" alt="(Screenshot courtesy of X/Laura Entis.)"&gt;&lt;/a&gt;&lt;figcaption class="quill-image-caption"&gt;(Screenshot courtesy of X/Laura Entis.)&lt;/figcaption&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;&lt;br&gt;&lt;/p&gt;&lt;p&gt;We still don’t know exactly what was discussed in the AI huddle, but Mistral’s inclusion led to, you guessed it, &lt;u&gt;&lt;a href="https://x.com/IntCyberDigest/status/2067377692027056212" rel="noopener noreferrer" target="_blank"&gt;more&lt;/a&gt;&lt;/u&gt; &lt;u&gt;&lt;a href="https://x.com/julien_c/status/2067218715142185435" rel="noopener noreferrer" target="_blank"&gt;memes&lt;/a&gt;&lt;/u&gt;.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;em&gt;&lt;a href="https://every.to/@laura_27bbaf_1" rel="noopener noreferrer" target="_blank"&gt;Laura Entis&lt;/a&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt; is a staff writer at Every. 
You can follow her on &lt;a href="https://www.linkedin.com/in/lauraentis/" rel="noopener noreferrer" target="_blank"&gt;LinkedIn&lt;/a&gt;. &lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;To read more essays like this, subscribe to &lt;u&gt;&lt;a href="https://every.to/subscribe" rel="noopener noreferrer" target="_blank"&gt;Every&lt;/a&gt;&lt;/u&gt;, and follow us on X at &lt;u&gt;&lt;a href="http://twitter.com/every" rel="noopener noreferrer" target="_blank"&gt;@every&lt;/a&gt;&lt;/u&gt; and on &lt;u&gt;&lt;a href="https://www.linkedin.com/company/everyinc/" rel="noopener noreferrer" target="_blank"&gt;LinkedIn&lt;/a&gt;&lt;/u&gt;.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;We &lt;u&gt;&lt;a href="https://every.to/studio" rel="noopener noreferrer" target="_blank"&gt;build AI tools&lt;/a&gt;&lt;/u&gt; for readers like you. Write brilliantly with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://writewithspiral.com/" rel="noopener noreferrer" target="_blank"&gt;Spiral&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. Organize files automatically with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://makeitsparkle.co/?utm_source=everyfooter" rel="noopener noreferrer" target="_blank"&gt;Sparkle&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. Deliver yourself from email with &lt;u&gt;&lt;a href="https://cora.computer/" rel="noopener noreferrer" target="_blank"&gt;Cora&lt;/a&gt;&lt;/u&gt;. Dictate effortlessly with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://monologue.to/" rel="noopener noreferrer" target="_blank"&gt;Monologue&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. Collaborate with agents on documents with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;a href="https://www.proofeditor.ai/" rel="noopener noreferrer" target="_blank"&gt;Proof&lt;/a&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. 
&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;For sponsorship opportunities, reach out to sponsorships@every.to.&lt;/em&gt;&lt;/p&gt;&lt;div class="quill-button" data-source="{&amp;quot;id&amp;quot;:&amp;quot;quill-button-1769187301610&amp;quot;,&amp;quot;url&amp;quot;:&amp;quot;https://every.to/subscribe?source=post_button&amp;quot;,&amp;quot;text&amp;quot;:&amp;quot;Subscribe&amp;quot;}" id="quill-button-1769187301610"&gt;&lt;a href="https://every.to/subscribe?source=post_button"&gt;Subscribe&lt;/a&gt;&lt;/div&gt;</description>
      <author>Laura Entis / Context Window</author>
      <pubDate>2026-06-18 12:00:00 -0400</pubDate>
      <guid>https://every.to/context-window/how-anthropic-makes-claude-more-reliable</guid>
      <link>https://every.to/context-window/how-anthropic-makes-claude-more-reliable</link>
    </item>
    <item>
      <title>Transcript: ‘Can GitHub Be for Everyone?’</title>
      <description>&lt;table&gt;&lt;tr&gt;&lt;td&gt;&lt;img alt="AI &amp;amp; I" src="https://d24ovhgu8s7341.cloudfront.net/uploads/publication/logo/97/small_ai_and_i_cover_1.png" /&gt;&lt;/td&gt;&lt;td&gt;&lt;/td&gt;&lt;td&gt;&lt;table&gt;&lt;tr&gt;&lt;td&gt;by &lt;a href="https://every.to/@mike_2114" itemprop="name"&gt;Mike Taylor&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;in &lt;a href="https://every.to/podcast"&gt;AI &amp;amp; I&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;p&gt;The transcript of &lt;em&gt;AI &amp;amp; I&lt;/em&gt; with &lt;strong&gt;&lt;a href="https://every.to/@mike_2114" rel="noopener noreferrer" target="_blank"&gt;Mike Taylor&lt;/a&gt;&lt;/strong&gt; and GitHub COO &lt;strong&gt;Kyle Daigle&lt;/strong&gt; is below. Watch on &lt;a href="https://x.com/danshipper/status/2067292771522654626" rel="noopener noreferrer" target="_blank"&gt;X&lt;/a&gt; or &lt;a href="https://youtu.be/OCEVqy8kl7Q" rel="noopener noreferrer" target="_blank"&gt;YouTube&lt;/a&gt;, or listen on &lt;a href="https://open.spotify.com/episode/62NJTryUh6D8idheRZJm0e?si=NEE6UvQzRym2jnak7gFbUg" rel="noopener noreferrer" target="_blank"&gt;Spotify&lt;/a&gt; or &lt;a href="https://podcasts.apple.com/us/podcast/githubs-coo-explains-why-ai-hasnt-replaced-developers/id1719789201?i=1000773140257" rel="noopener noreferrer" target="_blank"&gt;Apple Podcasts&lt;/a&gt;.&lt;/p&gt;&lt;h2&gt;Timestamps&lt;/h2&gt;&lt;ol&gt;&lt;li&gt;Introduction: 00:00:52&lt;/li&gt;&lt;li&gt;The agentic PR flood: 00:03:27&lt;/li&gt;&lt;li&gt;GitHub’s approach to helping open-source maintainers manage the surge: 00:04:33&lt;/li&gt;&lt;li&gt;What 14 billion commits means for code quality: 00:06:15&lt;/li&gt;&lt;li&gt;Moving from per-seat licensing to usage-based pricing: 00:08:03&lt;/li&gt;&lt;li&gt;Kyle’s dual role as GitHub COO and Microsoft’s chief marketing officer for developers: 00:09:45&lt;/li&gt;&lt;li&gt;Developer choice as competitive moat: 
00:13:03&lt;/li&gt;&lt;li&gt;How to balance dogfooding your own tools with staying honest about the competition: 00:14:57&lt;/li&gt;&lt;li&gt;Hill climbing, frontier tuning, and solving the model-routing problem: 00:19:45&lt;/li&gt;&lt;li&gt;Kyle’s agentic communication hack: 00:24:45&lt;/li&gt;&lt;/ol&gt;&lt;h2&gt;Transcript&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;(00:00:52)&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Mike Taylor&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;First thing I wanted to ask about—and we were touching on it yesterday—is that the demographics of your customer are changing, right? A lot of people who previously would never have used GitHub, or never used developer products before, are now using them. How has that changed the way you decide the product roadmap?&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Kyle Daigle&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;For GitHub in particular, we’ve always had this really expansive view of what a developer is. I started as a developer before I would have ever called myself one. I was just writing code for myself, and I went down a completely different career path—I didn’t go to school for computer science. I was going to art school. I wrote code to pay for art school, which, as an adult now, seems like a very silly decision.&lt;/p&gt;&lt;p&gt;But I had that journey of realizing: I can create tools and deliver them to people who can have that same experience of just wanting to build an app for themselves or their family, maybe as a startup, maybe as a business. We very much have serious developer tools—all the largest businesses are using GitHub—but when I look at something like the GitHub Copilot app, I see just as many developers using AI every day, running multiple projects, all kinds of agent sessions at the same time. I also see our legal team at GitHub using the Copilot app, or our finance team. I was meeting with a customer today and they were saying the same thing. 
A lot of the folks the industry would call knowledge workers, or non-by-trade developers, are using these tools to build little apps or assets for themselves.&lt;/p&gt;&lt;p&gt;So while our focus is very much on developers, we want to make it easier for people to choose to try to write some code—and make sure there’s always an on-ramp, now with things like the GitHub Copilot app.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;(00:03:27)&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Mike Taylor&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;And then how do you help developers deal with the burden of all that extra volume? Open-source maintainers I talk to are drowning in PRs. What needs to happen to help them?&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Kyle Daigle&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;For all developers, we’re building tools like the Copilot code review. It’s now agentic, so it finds a lot more novel vulnerabilities, and you can comment and the agent will take that on and go implement the change if you want. That code review step is, in some ways, overlooked as a really great way to get PRs to a place that are much more easily reviewed.&lt;/p&gt;&lt;p&gt;The agentic merge in the app is another place where we see a lot of value—both internally and in the community. You may comment on something that has a code review, you go through and get it almost all the way there, but then there are all those manual steps to finish processing the PR. Instead, I can go in and set exactly what I want to allow GitHub Copilot to do and say, ‘Okay, now go merge this PR, wait for CI, wait for policies’—all of that. That’s a big part of it.&lt;/p&gt;&lt;p&gt;On the open-source side, it’s a unique set of needs because you don’t control who’s sending everything in, and that’s really where we’ve been focusing—giving maintainers more tools to decide: do you want to accept all of these PRs? Who do you want to accept them from? 
How much work does someone need to do to prove they’re going to contribute something meaningful to this project? We want to provide those tools while leaving maintainers in control.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;(00:04:33)&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Mike Taylor&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Every community is choosing a slightly different way to approach the problem—&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Kyle Daigle&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;For GitHub, we’ve always wanted to leave that in their hands. Give them tools and enable them, but if a standard emerges or most communities settle on a certain practice, we’ll lock that in. But we don’t really ever want to be the first to create a standard or an approach.&lt;/p&gt;&lt;p&gt;Mitchell Hashimoto shared the vouch system they use, and I was getting questions like, ‘Well, why aren’t you rolling this out to everybody?’ But there are just as many communities that don’t want that system because they have their own ideas of how it should work. For now we’re focusing on the building blocks of controls for maintainers, and as we all learn together and maintainers send feedback in, we’ll cement an entire system if one emerges.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;(00:06:15)&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Mike Taylor&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;I feel like you have a front-row seat to this new agent economy. You said publicly on Twitter that you’ve had more pull requests submitted per month than you did all of last year. How are those stats exploding?&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Kyle Daigle&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;We’re seeing way more activity on GitHub. We’ve always talked about user growth, but this year we’re seeing the growth of developers building with agents. Last year at GitHub Universe in October, we shared there had been a billion commits on GitHub for the full year. We’re on track to be 14 billion if the growth were linear—which it won’t be. 
In March, there were 17 million pull requests created by agents. That’s just the agent pull requests.&lt;/p&gt;&lt;p&gt;There’s so much more code being created, and I think at times everyone goes, ‘Oh, this is all just slop—it’s code being pushed up and no one cares.’ That’s not really true. We’re all just actually getting to the point where we’re no longer in super-early adoption. We’re definitely not at the peak, but we’re climbing that hill, seeing what we can build when it’s not just Kyle building, but Kyle plus one, two, to N agents that are using my skills, using my resources, using my context, and so on. We’re investing heavily in preparing for the next wave of growth because it doesn’t seem to be growing and plateauing. It’s just going to continue because no matter what you’re building or what tools you’re using to build, all of that code ends up on GitHub, or that’s where you’re sharing it with the world, or that’s where you’re collaborating in a PR. We need to be able to support everyone’s agent moment—not just GitHub Copilot.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;(00:08:03)&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Mike Taylor&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;And how does the business model change? Freemium makes sense in a human-centered world where we go to bed. But agents are still working while we’re asleep now. Does that shift to usage-based pricing? Is that where things are going?&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Kyle Daigle&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;I don’t think we know yet, ultimately. Right now it’s very much: Kyle has a license, or Kyle’s using GitHub.com for free. We’ve always had API rate limits and things like that, and that’s usually where folks are seeing the agent back pressure.&lt;/p&gt;&lt;p&gt;The goal is that if you want to do way more—if you want, as Peter Steinberger says, 150 agents doing everything all at once—great, we want to enable that. 
But at the same time, I want you to have a great core GitHub experience, and at the very least there’s some amount of agent usage as part of that that’s necessary.&lt;/p&gt;&lt;p&gt;It’s similar to how way back you’d have free public repos but not free private repos. Then we said, ‘Okay, it’s fair for an individual to have some code they don’t want to put into the world, and we’ll give you free private repos.’ GitHub’s always evolving as the industry and community does. But we’re always focused on: I need to make sure you, the dev, have what you need to be successful—and then work with enterprises to make sure they have what they need at scale, which is usually a little bit different from what an individual dev is doing.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;(00:09:45)&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Mike Taylor&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;The business model and pricing all leads back into the wider Microsoft orbit. You now have a dual role—partial responsibility for the wider marketing org as well. Can you talk through how that’s changed and how you prioritize between the two?&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Kyle Daigle&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;I’ve been at GitHub for 13 years—as a developer myself and leading engineering teams for a lot of that time. What’s always been unique about GitHub is we really, really focus on the developer. We’re building tools for developers, and the fact that enterprises are buying them is awesome. But we’re not building for the buyers. We’re building for the developers.&lt;/p&gt;&lt;p&gt;That’s been my focus as the COO of GitHub, and now as the chief marketing officer for developer at Microsoft, my goal is to look across all of Microsoft’s tooling—their developer tools, the technology they’re bringing to developers—and make sure we’re bringing holistic solutions that are authentic to developer experiences. At events like this, we’ve taken a very different approach to Build this year. 
We’re in San Francisco, first off. The vibe is a bit different from the conference hall setup. It’s really focused on: Can I go to a session? Can I use the thing? I don’t want to be pitched on a thing—I have to be able to use it. It’s bringing that expertise and love and focus on the developer that GitHub’s always had to have an even broader impact throughout Microsoft.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Mike Taylor&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Did I hear you say this is the first Build that’s had external contributors?&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Kyle Daigle&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;It’s the first Build that by intention we’ve focused on having speakers from the community in these primary sessions. That includes the keynote, where we had a bunch of folks like Peter. There are sessions from Swyx and others as well.&lt;/p&gt;&lt;p&gt;I think it’s important—software development is a team sport. It seems silly to think there’s any one company, one group, inclusive of GitHub and Microsoft and everyone, that can answer every single question. That’s not how software gets made. We’re all at least using open source and building on the backs of these giant open-source projects. Let’s invite people in who can help tell their part of the story, because I deeply believe that’s what developers want. I know that’s what I want. I know that’s what my friends who are developers want. When we look at the events and hear the feedback, they’re excited to see people from Microsoft, from GitHub, and then, oh, I also get to see this outside perspective at this event.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;(00:13:03)&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Mike Taylor&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;It’s a very competitive market—maybe the most competitive market. 
How do you differentiate given the pace of change is so quick?&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Kyle Daigle&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;We continue to focus on our roots, which is that we care a lot about developer choice. It’s always been true. We care about building for builders and enabling builders. We’re in a moment that’s really interesting because we went from an era of having a ton of APIs and all this access to—kind of unintentionally—a walled garden setup. Where you get a certain affinity, sometimes I’ll call it a little bit of a mousetrap, and then you realize, ‘Oh, this thing is really interesting over here,’ and then you have to go learn a new tool or create a new account.&lt;/p&gt;&lt;p&gt;For us, we always want to enable developers building with GitHub to go use other tools, and we’ll partner with everyone to make that as simple as possible. I think the ability to do that across the entirety of building software—not just the code-gen side or just the collaboration review side, but across everything—is a real superpower of ours.&lt;/p&gt;&lt;p&gt;You’ll see us invest in our own tech, like the new Microsoft AI models that we’ll continue to bring to developers. We’re also continuing to partner with Anthropic, OpenAI, Google, and anyone bringing a model to market or a coding agent to market. We’ll partner with them and let you bring that to us, or we’ll surface it through GitHub and GitHub Copilot. That choice is core and something I don’t think we’ll ever back down on. 
Because if we do, developers will still choose—they’ll just be stuck in another mousetrap, and we don’t want the world of software to be like that.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;(00:14:57)&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Mike Taylor&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;When you’re making decisions internally—there was a news cycle recently about Claude Code licenses being canceled—how do you make the trade-off between dogfooding your own products, like using the new models you made or the GitHub Copilot desktop app, versus letting developers experiment with other tools?&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Kyle Daigle&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;We all use a variety of tools, because otherwise you lose track or you get too focused on your own work. For me personally, I’ve been a daily driver of a MacBook for many years. I use Windows PCs on weekends when I play video games. When I got this role, I set up my Mac, my PC, and an Omarchy Linux box so I can make sure that every weekend—I code most Saturdays, I do my kids’ sports activities in the morning and then in the afternoon I’m coding—I’m swapping between the boxes because I want to understand each experience.&lt;/p&gt;&lt;p&gt;I only use the GitHub Copilot app on Windows because I want to make sure that developers on Windows also deserve great apps—not just the audience on a Mac. That’s true across our teams, especially when we’re looking at coding agents, harnesses, desktop apps, memory management, everything. We have a really great culture of just experimentation. Everyone is building and using these tools.&lt;/p&gt;&lt;p&gt;While we’re obviously putting most of our energy into our own tools, it’s such a blind spot if you don’t look elsewhere. It’s happened to GitHub in the past—when you’re doing something well, you laser focus, and that’s what every piece of startup energy says: look down and just keep moving fast. I think that’s myopic. 
While I can’t spend every day using every tool, when something comes out, I want to know why it’s great. Why are people having a great experience with this? Not only so I can understand, but so I can figure out: for our goals, for our goal of developer choice, what do I need here? I want to know why a dev would pick these tools.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Mike Taylor&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;And how do you filter? Because a lot of these ideas are relatively short-lived, and enterprise product development cycles are longer-lived. How do you decide?&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Kyle Daigle&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Right now we’re in a moment where we’re really looking at the short term and capturing the ability to have a multitude of agent sessions. This idea of just—because it seems quite clear—everyone’s doing it. How can we cement it?&lt;/p&gt;&lt;p&gt;But it seems clear on the longer-term path that models are going to continue to get better. Token economics are going to be a bigger factor in what models everyone is using. I strongly believe we’re not very far off from having serious ability to use something above a small language model on a local device to do some of our work.&lt;/p&gt;&lt;p&gt;If I assume I have all this optionality when it comes to tokens, the thing that seems to be true from the beginning to Claude to now is this idea of personalization—mine, context, fine-tuning with context, memory. All of these ideas seem to be a truth that’s been there since ChatGPT came out or GitHub Copilot came out. There are experiments, but not a long-term vision for this across the industry.&lt;/p&gt;&lt;p&gt;I think it’s a good example of where I need to get you to use agents incredibly well, a lot of them, because if you’re into using agents you’re not just going to be staring at a single agent working. But that’s not going to give you a long-term great experience. 
Using an agent that you feel like is completing a thought for you will give you that great experience, especially if you did not have to personally codify that thought. Always remember that I insert that thing, you know? That’s a lot of work. It should be able to intuit that—or potentially post-train, fine-tune, or frontier-tune a model that deeply understands me and how I’m using the work. Sometimes it’s short term, and sometimes you have to take a bunch of attempts at the long term to get to something really tangible.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;(00:19:45)&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Mike Taylor&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;I heard the term ‘hill climbing’ about 100 times yesterday. I’m a big proponent of it—I’ve experimented with DSPy, Auto Research, a few others. Can you talk a bit about how that’s become such a big focus?&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Kyle Daigle&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Satya and Mustafa talk about it a fair bit, and Jacob too, leading the Copilot group. The biggest thing we’ve learned is we need to use the actual use of the tools as a core way to improve the underlying use of the models and our own models—the evals that are necessary to ensure we’re actually improving, from things like using thumbs up and thumbs down data to whether you’re accepting a suggestion and how much you’re accepting.&lt;/p&gt;&lt;p&gt;All of that data is enormous in creating a magical type of experience—not just for you, but for everyone. So every week we’re talking about the hill climbing results. We’re looking at the data, the improvement, both the hard measures and the soft measures—because sometimes the hard measures and evals and rubrics will show that we’ve made an improvement, but user sentiment will crash. 
Even with the same latency and performance.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Mike Taylor&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;That’s overfitting, basically.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Kyle Daigle&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;100%. Being able to do that loop incredibly quickly is the goal, and then the main goal is giving everyone one of these hill climbing machines and not having you do it the hard way we’ve all been doing it.&lt;/p&gt;&lt;p&gt;Particularly if you’re in an enterprise and you’re using Microsoft 365, we know so much about that data—or could know so much—because of all the assets, all the documents, the chats. Being able to turn on something like frontier tuning and using MAI Thinking 1 as the base model shows real results without having to do all that extra work. And it’s been interesting because when I first heard about this, I’ll be honest: I was like, ‘This is like a magic parlor trick that is not going to be real.’ But the reality is that sometimes where the alpha is, is where it feels like this is too simple to work. We all have all this data, and what are we gonna do with it? We have to do all this effort to make it work. But so much has come down the pipe to allow us to just use the data and improve, look at the workflow and improve, and keep doing the hill climbing.&lt;/p&gt;&lt;p&gt;That’s why I think we say it so much—it’s not these moonshots. It’s climb, climb, improve, new eval, improve, new data, improve, and just keep going to get to the point where we’re able to launch these models for ourselves. 
Then allow customers to use the same or similar tooling to do it.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Mike Taylor&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Is that the answer to stopping the $200 subscription from becoming a $2,000 subscription?&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Kyle Daigle&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;I think the $200-to-$2,000 problem is going to be addressed not only by frontier-tuning models so they know you better, but really by helping developers automatically choose the right models—potentially having a model in that step, like the model router in GitHub Copilot. Microsoft Foundry has a model router as well that can do this at an API level.&lt;/p&gt;&lt;p&gt;The more we can help you tell us where your bars are—like, this is an incredibly hard problem and I’m willing to go all the way to the top, or I just want to sit here—and let us help choose the models, the better. Because a lot of the reason tokens are expensive is that we’re all going and choosing our model of the day or week or hour, and those models are incredibly expensive.&lt;/p&gt;&lt;p&gt;My train of thought is slipping in and out of a hard problem to a simple problem. I’ll personally get an agent to do an enormous amount of work, and then there’s always that last step that’s a small thing—like, ‘Oh, just change all the naming of this to that.’ That’s a find and replace. Am I going to actually go save tokens by switching down to Haiku or something? Probably not. But the tools could. And that will really help us, particularly in the enterprise—but even for individual developers and folks building automations using the Copilot SDK to power that.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;(00:24:45)&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Mike Taylor&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;I did something a little bit weird. 
I hope you don’t find it creepy, but I made an AI version of you to practice this interview.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Kyle Daigle&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;No way.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Mike Taylor&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Yeah, and it’s actually been pretty spot on so far, and hopefully you think the questions have been good.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Kyle Daigle&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;They’ve been great.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Mike Taylor&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;It’s just in the terminal—I didn’t go the full whack and make a video thing. I’m a bit shy. But I found it immensely useful. I wanted to ask: what other weird things are you seeing people do internally or externally?&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Kyle Daigle&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;It’s so funny you say that because I do a very similar thing. I have both, via the app and then I have a Claude that can’t talk to work stuff—just so I have separation of state. Where I spend a lot of time having it read everything I write and say. This interview will get fed into it ultimately. And every day I get a comms report that’s not like, ‘What Kyle said,’ but like—‘Kyle, you keep saying this.’ ‘This isn’t super clear.’ Based on how you speak, because I find that I write and speak in a very particular way—I want to use a lot of metaphors. So it’ll just give me examples of metaphors that are clear.&lt;/p&gt;&lt;p&gt;I find the self-improvement loop as a human, from these agents, to be incredibly powerful. We used to talk about it way back with Hubot at GitHub, like ChatOps. We used to say: humans are way more willing to take critical feedback from robots than from other humans.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Mike Taylor&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;It’s less threatening.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Kyle Daigle&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;100%. 
And when my Claude instance—which I affectionately named Baxter—tells me how terrible I did at something, I feel way better going, ‘Tell me why.’ And then ensuring that when I’m writing emails, writing a script, or reviewing details, it’s giving me that feedback. So a lot of my agent loop is really about me and less about the software side. I still have all those tools too, but it’s always looking backwards—‘Okay, the last seven days, read all Kyle’s emails and Slack messages and give me feedback.’ Then look back at what the agent told me to do, did Kyle do it, and go back another seven days. That loop is super powerful and I think, honestly, the type of personal consumer experience I want out of AI—to be able to tune these tools that way.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Mike Taylor&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;We need to recursively self-improve as well.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Kyle Daigle&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;100%.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Mike Taylor&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Thanks so much, Kyle. 
I appreciate it.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Kyle Daigle&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Enjoyed it.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://every.to/@mike_2114" rel="noopener noreferrer" target="_blank"&gt;Mike Taylor&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt; &lt;em&gt;is the head of tech consulting at Every and a co-author of &lt;/em&gt;&lt;u&gt;&lt;a href="https://www.oreilly.com/library/view/prompt-engineering-for/9781098153427/" rel="noopener noreferrer" target="_blank"&gt;Prompt Engineering for Generative AI&lt;/a&gt;&lt;/u&gt; (O’Reilly)&lt;em&gt;.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;To read more essays like this, subscribe to &lt;u&gt;&lt;a href="https://every.to/subscribe" rel="noopener noreferrer" target="_blank"&gt;Every&lt;/a&gt;&lt;/u&gt;, and follow us on X at &lt;u&gt;&lt;a href="http://twitter.com/every" rel="noopener noreferrer" target="_blank"&gt;@every&lt;/a&gt;&lt;/u&gt; and on &lt;u&gt;&lt;a href="https://www.linkedin.com/company/everyinc/" rel="noopener noreferrer" target="_blank"&gt;LinkedIn&lt;/a&gt;&lt;/u&gt;.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;For sponsorship opportunities, reach out to sponsorships@every.to.&lt;/em&gt;&lt;/p&gt;&lt;div class="quill-button" data-source="{&amp;quot;id&amp;quot;:&amp;quot;quill-button-1769187301610&amp;quot;,&amp;quot;url&amp;quot;:&amp;quot;https://every.to/subscribe?source=post_button&amp;quot;,&amp;quot;text&amp;quot;:&amp;quot;Subscribe&amp;quot;}" id="quill-button-1769187301610"&gt;&lt;a href="https://every.to/subscribe?source=post_button"&gt;Subscribe&lt;/a&gt;&lt;/div&gt;</description>
      <author>Mike Taylor / AI &amp; I</author>
      <pubDate>2026-06-17 07:00:00 -0400</pubDate>
      <guid>https://every.to/podcast/transcript-can-github-be-for-everyone</guid>
      <link>https://every.to/podcast/transcript-can-github-be-for-everyone</link>
    </item>
    <item>
      <title>Loops for Non-coders</title>
      <description>&lt;table&gt;&lt;tr&gt;&lt;td&gt;&lt;img alt="Context Window" src="https://d24ovhgu8s7341.cloudfront.net/uploads/publication/logo/94/small_context_windown_1.png" /&gt;&lt;/td&gt;&lt;td&gt;&lt;/td&gt;&lt;td&gt;&lt;table&gt;&lt;tr&gt;&lt;td&gt;by &lt;a href="https://every.to/@laura_27bbaf_1" itemprop="name"&gt;Laura Entis&lt;/a&gt; and &lt;a href="https://every.to/@katie.parrott12" itemprop="name"&gt;Katie Parrott&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;in &lt;a href="https://every.to/context-window"&gt;Context Window&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;figure&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/post/cover/4304/full_page_cover_7ea40ea394cd3062-CW_Cover_Image_1.png"&gt;&lt;figcaption&gt;Midjourney/Every illustration. &lt;/figcaption&gt;&lt;/figure&gt;&lt;p&gt;AI can be exhilarating and destabilizing. Just when you think you have your setup figured out, a &lt;u&gt;&lt;a href="https://every.to/vibe-check/anthropic-mythos-our-fable-vibe-check" rel="noopener noreferrer" target="_blank"&gt;powerful new model drops&lt;/a&gt;&lt;/u&gt;—or, in the case of Anthropic’s Fable 5, gets &lt;u&gt;&lt;a href="https://every.to/context-window/fable-disabled" rel="noopener noreferrer" target="_blank"&gt;abruptly disabled&lt;/a&gt;&lt;/u&gt;. 
Today, we explore this instability from multiple angles: Staff writer &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@katie.parrott12" rel="noopener noreferrer" target="_blank"&gt;Katie Parrott&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; maps the grief (and coping mechanisms) that accompanied the Fable ban and shares a practical playbook for the next time a model you depend on disappears, head of growth &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@tedescau" rel="noopener noreferrer" target="_blank"&gt;Austin Tedesco&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; explains how loops are causing him to rethink his approach to working with AI, and GitHub chief operating officer &lt;strong&gt;Kyle Daigle&lt;/strong&gt; tells &lt;em&gt;&lt;u&gt;&lt;a href="https://every.to/podcast" rel="noopener noreferrer" target="_blank"&gt;AI &amp;amp; I&lt;/a&gt;&lt;/u&gt;&lt;/em&gt; guest host &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@mike_2114" rel="noopener noreferrer" target="_blank"&gt;Mike Taylor&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; how the company is responding to an agent-generated surge in commits.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;&lt;strong&gt;‘AI &amp;amp; I’: Can GitHub be for everyone?&lt;/strong&gt;&lt;/h2&gt;&lt;p&gt;Today we’re releasing a new episode of our podcast &lt;em&gt;&lt;u&gt;&lt;a href="https://every.to/podcast" rel="noopener noreferrer" target="_blank"&gt;AI &amp;amp; I&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;. Head of tech consulting &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@mike_2114" rel="noopener noreferrer" target="_blank"&gt;Mike Taylor&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; guest hosted this week and spoke to GitHub COO &lt;strong&gt;Kyle Daigle&lt;/strong&gt; about how the company is responding now that everyone—and their army of agents—can ship code. &lt;/p&gt;&lt;p&gt;The volume is extreme: Last year, there were 1 billion commits on GitHub. 
This year, that figure will safely exceed 14 billion, Daigle says, which puts GitHub in an important but delicate position: It must help developers handle agent-generated code without dictating which pull requests communities should trust or merge.&lt;/p&gt;&lt;p&gt;Watch on &lt;u&gt;&lt;a href="https://x.com/danshipper/status/2067292771522654626" rel="noopener noreferrer" target="_blank"&gt;X&lt;/a&gt;&lt;/u&gt; or &lt;u&gt;&lt;a href="https://youtu.be/OCEVqy8kl7Q" rel="noopener noreferrer" target="_blank"&gt;YouTube&lt;/a&gt;&lt;/u&gt;, listen on &lt;u&gt;&lt;a href="https://open.spotify.com/episode/62NJTryUh6D8idheRZJm0e?si=NEE6UvQzRym2jnak7gFbUg" rel="noopener noreferrer" target="_blank"&gt;Spotify&lt;/a&gt;&lt;/u&gt; or &lt;u&gt;&lt;a href="https://podcasts.apple.com/us/podcast/githubs-coo-explains-why-ai-hasnt-replaced-developers/id1719789201?i=1000773140257" rel="noopener noreferrer" target="_blank"&gt;Apple Podcasts&lt;/a&gt;&lt;/u&gt;, or &lt;u&gt;&lt;a href="https://every.to/podcast/transcript-can-github-be-for-everyone" rel="noopener noreferrer" target="_blank"&gt;read the transcript&lt;/a&gt;&lt;/u&gt;. And for a behind-the-scenes look at the making of the podcast, check out &lt;u&gt;&lt;a href="https://every.to/also-true-for-humans/i-interviewed-an-ai-version-of-github-s-coo-then-spoke-to-the-real-one" rel="noopener noreferrer" target="_blank"&gt;Mike’s piece&lt;/a&gt;&lt;/u&gt; on his decision to ditch standard-issue prep in favor of building and mock interviewing an AI version of Daigle. &lt;/p&gt;&lt;p&gt;Here are the highlights:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;strong&gt;The developer versus non-developer distinction is disappearing&lt;/strong&gt;: GitHub has long taken an expansive view of who counts as a developer, but AI has blown up the definition entirely. Legal, finance, sales, and marketing professionals are using the GitHub Copilot app to build prototypes and apps. 
“A lot of the folks that the industry would call knowledge workers, or just non-developers by trade, are using these tools,” Daigle says. &lt;/li&gt;&lt;li&gt;&lt;strong&gt;Agents can write and review code, but humans decide what ships&lt;/strong&gt;: GitHub has built agentic code review and merge tools to help developers handle the surge of pull requests, but people who run open-source projects should ultimately decide which outside submissions they merge. “We want to provide tools,” Daigle says, “but really leave them in control.”&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Daigle runs a daily loop on himself&lt;/strong&gt;: In AI, a loop is a cycle in which an agent does work, evaluates the result against a goal or standard, incorporates feedback, and repeats the process until the task is complete or the output improves. Daigle uses the same workflow to improve his communication style—each day, an agent reviews a rolling seven-day window of his emails and Slack messages, identifies patterns, provides constructive feedback, and checks whether he incorporated its advice.&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;Miss an episode? 
Catch up on Dan’s recent conversations with LinkedIn cofounder &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/podcast/reid-hoffman-makes-five-predictions-about-ai-in-2026" rel="noopener noreferrer" target="_blank"&gt;Reid Hoffman&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;; the team that built Claude Code, &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/podcast/how-to-use-claude-code-like-the-people-who-built-it" rel="noopener noreferrer" target="_blank"&gt;Cat Wu&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; and &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/podcast/how-to-use-claude-code-like-the-people-who-built-it" rel="noopener noreferrer" target="_blank"&gt;Boris Cherny&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;; Vercel cofounder &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/podcast/vercel-s-guillermo-rauch-on-what-comes-after-coding" rel="noopener noreferrer" target="_blank"&gt;Guillermo Rauch&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;; podcaster &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/podcast/dwarkesh-patel-s-quest-to-learn-everything" rel="noopener noreferrer" target="_blank"&gt;Dwarkesh Patel&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;; and others, and learn how they use AI to think, create, and relate.&lt;/p&gt;&lt;h3&gt;&lt;hr class="quill-line"&gt;&lt;/h3&gt;&lt;h2&gt;&lt;strong&gt;Inside Every &lt;/strong&gt;&lt;/h2&gt;&lt;h4&gt;&lt;strong&gt;Loops, loops, loops &lt;/strong&gt;&lt;/h4&gt;&lt;p&gt;“I’m super loop-pilled,” says head of growth &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@tedescau" rel="noopener noreferrer" target="_blank"&gt;Austin Tedesco&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;. He’s not alone. 
&lt;u&gt;&lt;a href="https://every.to/chain-of-thought/the-moral-of-fable" rel="noopener noreferrer" target="_blank"&gt;Loops&lt;/a&gt;&lt;/u&gt;—which have AI tackle a goal through iterative cycles of completing a section of the task, reviewing the results, incorporating the learnings, and generating the next step—have become a hot topic of discussion here at Every in recent days. &lt;/p&gt;&lt;p&gt;While some have been using them for a while—looking at you, &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@kieran_1355" rel="noopener noreferrer" target="_blank"&gt;Kieran Klaassen&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;, the father of the loop-based engineering philosophy, &lt;u&gt;&lt;a href="https://every.to/guides/compound-engineering" rel="noopener noreferrer" target="_blank"&gt;compound engineering&lt;/a&gt;&lt;/u&gt;—newer, more powerful models have made this approach possible for non-code-based workflows, too. &lt;/p&gt;&lt;p&gt;For Austin, &lt;u&gt;&lt;a href="https://every.to/vibe-check/anthropic-mythos-our-fable-vibe-check" rel="noopener noreferrer" target="_blank"&gt;Fable&lt;/a&gt;&lt;/u&gt;—RIP, at least &lt;u&gt;&lt;a href="https://every.to/context-window/fable-disabled" rel="noopener noreferrer" target="_blank"&gt;temporarily&lt;/a&gt;&lt;/u&gt;—was what made the power of loops click into place. The “aha” moment came when he gave the model a simple prompt: &lt;/p&gt;&lt;blockquote&gt;Build me an NBA simulation game that’s like &lt;em&gt;Football Manager&lt;/em&gt;: basically &lt;em&gt;NBA 2K&lt;/em&gt; Dynasty mode without gameplay. I want to pretend I’m an NBA general manager and make real transactions. Build it through the /LFG flow—the &lt;u&gt;&lt;a href="https://github.com/EveryInc/compound-engineering-plugin" rel="noopener noreferrer" target="_blank"&gt;compound engineering&lt;/a&gt;&lt;/u&gt; workflow that brainstorms, plans, builds, reviews, and improves—then keep looping overnight. 
&lt;/blockquote&gt;&lt;p&gt;When he woke up, he had a working NBA front-office simulator. What floored him was how the model worked through the task: Fable would hit a stopping point, review what was missing, write itself a new one-paragraph prompt, and keep going. At one point, it realized the game needed NBA-specific salary-cap logic, including rules around teams renouncing free agents before July 1. It also simulated a full season and logged what happened, so it could inspect the game’s behavior. &lt;/p&gt;&lt;div class="quill-block-image" id="quill-block-image-1781719634220" data-source="{&amp;quot;dom_id&amp;quot;:&amp;quot;quill-block-image-1781719634220&amp;quot;,&amp;quot;link&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4304/optimized_49ed53f1-a9ac-4923-b8ff-86c8ae68a320.png&amp;quot;,&amp;quot;image&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4304/optimized_49ed53f1-a9ac-4923-b8ff-86c8ae68a320.png&amp;quot;,&amp;quot;caption&amp;quot;:&amp;quot;Austin’s one-shotted NBA front-office simulator. (Image courtesy of Austin Tedesco.)&amp;quot;,&amp;quot;error&amp;quot;:null}"&gt;&lt;div&gt;&lt;a href="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4304/optimized_49ed53f1-a9ac-4923-b8ff-86c8ae68a320.png" target="_blank" rel="noopener noreferrer"&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4304/optimized_49ed53f1-a9ac-4923-b8ff-86c8ae68a320.png" alt="Austin’s one-shotted NBA front-office simulator. (Image courtesy of Austin Tedesco.)"&gt;&lt;/a&gt;&lt;figcaption class="quill-image-caption"&gt;Austin’s one-shotted NBA front-office simulator. (Image courtesy of Austin Tedesco.)&lt;/figcaption&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;&lt;br&gt;&lt;/p&gt;&lt;p&gt;Every’s growth team has used the same setup to improve our subscription flow and create a net-new pricing page. 
We supplied the goal, the context, and the desired output, and the agent looped through the rest.&lt;/p&gt;&lt;p&gt;Even though Fable is temporarily inaccessible, Austin is trying to recreate pieces of the same cycle in Codex with /goal, a feature that gives Codex a persistent objective to work toward across a longer-running session.&lt;/p&gt;&lt;p&gt;Get CEO &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@danshipper" rel="noopener noreferrer" target="_blank"&gt;Dan Shipper&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;’s &lt;u&gt;&lt;a href="https://every.to/p/claude-fable-5-prompt-library?source=post_button" rel="noopener noreferrer" target="_blank"&gt;Fable prompt for loops&lt;/a&gt;&lt;/u&gt; or steal Austin’s workflow: &lt;/p&gt;&lt;ol&gt;&lt;li&gt;&lt;strong&gt;Define the problem and output. &lt;/strong&gt;For example, “Our subscription flow and pricing page aren’t converting the way they should. Find the weak spots and produce new experiments we could ship or mock in Figma.”&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Give the agent a way to review its work. &lt;/strong&gt;Provide the materials it would need to judge whether an idea is good: who it’s building for, what those users are trying to do, what their current experience looks like, where the data says people get stuck, what competitors do differently, and what constraints it needs to work within. Then ask it to test each idea against that context before writing its next prompt.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Set the loop in motion&lt;/strong&gt;. Tell the agent: “When you hit a stopping point, write yourself a one-paragraph prompt for the next improvement and keep going until the outcome is materially better.” Let it generate pricing-page variants, Figma experiments, or site changes. Then review the outputs and decide what’s good enough to act on. 
&lt;/li&gt;&lt;/ol&gt;&lt;h3&gt;&lt;hr class="quill-line"&gt;&lt;/h3&gt;&lt;h2&gt;&lt;strong&gt;Pulse Check&lt;/strong&gt;&lt;/h2&gt;&lt;h3&gt;&lt;strong&gt;Your favorite model will die&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;When Fable—the best coding model &lt;u&gt;&lt;a href="https://every.to/vibe-check/anthropic-mythos-our-fable-vibe-check" rel="noopener noreferrer" target="_blank"&gt;Every had tested&lt;/a&gt;&lt;/u&gt;—shut down on June 12, the tech community lived through a compressed version of the five stages of grief. It was a preview of what you should expect every time a model goes away. Here’s what helped, and what to do the next time the weights vanish.&lt;/p&gt;&lt;h4&gt;&lt;strong&gt;Save the session before you need it&lt;/strong&gt;&lt;/h4&gt;&lt;p&gt;When a model you lean on is about to change, don’t rush to replace it—first, stop throwing away the evidence. The model’s plans, edits, tests, slip-ups, and fixes live on in chat histories in web apps and log files on your computer, and you can reuse that record. &lt;/p&gt;&lt;p&gt;One Fable user, posting as &lt;strong&gt;&lt;u&gt;&lt;a href="https://x.com/nptacek/status/2065997189394907465" rel="noopener noreferrer" target="_blank"&gt;CuddlySalmon&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;, kept a session open from before the shutdown and reported that it still seemed to carry Fable’s context after Claude fell back to &lt;u&gt;&lt;a href="https://every.to/vibe-check/opus-4-8-vibecheck" rel="noopener noreferrer" target="_blank"&gt;Opus 4.8&lt;/a&gt;&lt;/u&gt;—though they admitted that might be wishful thinking. Another used &lt;u&gt;&lt;a href="https://every.to/guides/codex-for-knowledge-work" rel="noopener noreferrer" target="_blank"&gt;Codex&lt;/a&gt;&lt;/u&gt; helper agents to dig design choices out of old Fable runs. &lt;/p&gt;&lt;p&gt;&lt;strong&gt;Do this now:&lt;/strong&gt; Export a few sessions from whatever model you depend on, and save the full trail of what it did, not a tidy summary. 
The weird little choices are what you’ll want later.&lt;/p&gt;&lt;h4&gt;&lt;strong&gt;Build while you have the opportunity&lt;/strong&gt;&lt;/h4&gt;&lt;p&gt;As the shutdown news hit Every’s company Slack, &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@danshipper" rel="noopener noreferrer" target="_blank"&gt;Dan&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; gave the team one order: “quick BUILD EVERYTHING.” In the three days we had, Every took maximum advantage of the opportunity. We used Fable to build &lt;u&gt;&lt;a href="https://middlepix.vercel.app/" rel="noopener noreferrer" target="_blank"&gt;a &lt;/a&gt;&lt;/u&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://middlepix.vercel.app/" rel="noopener noreferrer" target="_blank"&gt;Lord of the Rings&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;u&gt;&lt;a href="https://middlepix.vercel.app/" rel="noopener noreferrer" target="_blank"&gt; version of graphics editor Kid Pix&lt;/a&gt;&lt;/u&gt;, an open-source &lt;u&gt;&lt;a href="https://github.com/EveryInc/hands-on-deck" rel="noopener noreferrer" target="_blank"&gt;deck-building tool&lt;/a&gt;&lt;/u&gt;, and an &lt;u&gt;&lt;a href="https://shader-henna.vercel.app/" rel="noopener noreferrer" target="_blank"&gt;interactive shader tool&lt;/a&gt;&lt;/u&gt;, each in an afternoon.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Do this now:&lt;/strong&gt; When a new frontier model comes out, give your team the time and space to experiment in the first few days. You may not get another three-day sprint like that. &lt;/p&gt;&lt;h4&gt;&lt;strong&gt;Determine which work needed the frontier model&lt;/strong&gt;&lt;/h4&gt;&lt;p&gt;The morning after the shutdown, Dan asked the team what they’d do if Fable stayed gone. Kieran said that he would use Opus, like before. 
&lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@naveen_6804" rel="noopener noreferrer" target="_blank"&gt;Naveen Naidu&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;, who runs our voice app &lt;strong&gt;&lt;u&gt;&lt;a href="https://monologue.to" rel="noopener noreferrer" target="_blank"&gt;Monologue&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;, said that he would go back to Codex. &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@williewilliams" rel="noopener noreferrer" target="_blank"&gt;Willie Williams&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;, Every’s head of platform, would keep Codex as his daily tool and move the &lt;u&gt;&lt;a href="https://every.to/plus-one" rel="noopener noreferrer" target="_blank"&gt;Plus One&lt;/a&gt;&lt;/u&gt; factory back to Opus.&lt;/p&gt;&lt;p&gt;Most of your work never needed the top model in the first place. Some does. While building &lt;u&gt;&lt;a href="https://github.com/EveryInc/hands-on-deck" rel="noopener noreferrer" target="_blank"&gt;hands-on-deck&lt;/a&gt;&lt;/u&gt;, an open-source slide tool, with Fable, &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@nityesh" rel="noopener noreferrer" target="_blank"&gt;Nityesh Agarwal&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;—Every’s senior AI engineer—noticed that the more ambitious his ask, the more ambitious the model’s output—each one pushed the other higher. On Opus that pull was gone. He could still do the work; he’d just lost the nudge to attempt the bigger version. &lt;/p&gt;&lt;p&gt;&lt;strong&gt;Do this now:&lt;/strong&gt; Take one real Fable-size task and run it on your backup model. If it passes, that work never needed the top model, so stop paying extra for it. If it fails, you’ve found work worth protecting, and you know where to put your effort. 
&lt;/p&gt;&lt;h4&gt;&lt;strong&gt;Build for the model to disappear&lt;/strong&gt;&lt;/h4&gt;&lt;p&gt;Treat the outage as a fire drill for moving work between models, which means that the work needs to be portable and you need to be able to check for mistakes. &lt;/p&gt;&lt;p&gt;On portability, &lt;strong&gt;&lt;u&gt;&lt;a href="https://www.blockchaincommons.com/about/" rel="noopener noreferrer" target="_blank"&gt;Christopher Allen&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;, who runs the nonprofit Blockchain Commons, built a &lt;u&gt;&lt;a href="https://github.com/ChristopherA/claude-workstream-kit" rel="noopener noreferrer" target="_blank"&gt;Claude workstream kit&lt;/a&gt;&lt;/u&gt; that captures the goal, the to-do list, the choices made, the lessons learned, the current task, the next step, and the blockers—all in plain text files tracked by Git. Because everything lives in the files rather than the chat history, a new model can pick the work right up—nothing key gets stranded in a conversation that no longer exists. On verification: A portable handoff isn’t enough if you can’t tell whether the work is correct. You also need a built-in check—a test or a “done” rule the model can demonstrably pass—that proves it’s so.&lt;/p&gt;&lt;h5&gt;&lt;strong&gt;Do this week:&lt;/strong&gt;&lt;/h5&gt;&lt;ul&gt;&lt;li&gt;&lt;strong&gt;Run one live task through a backup.&lt;/strong&gt; Pick something you do often, with an output you can judge.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Write down what the new model couldn’t recover.&lt;/strong&gt; Move those choices, examples, preferences, and “done” rules into project files.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Add a test that proves the job is done.&lt;/strong&gt; The fastest backup is the model that can show its work, even if its style leaves you cold.&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;Fable may come back. 
Either way, the next model you build on shouldn’t be able to take your work down with it.—&lt;em&gt;&lt;u&gt;&lt;a href="https://every.to/@katie.parrott12" rel="noopener noreferrer" target="_blank"&gt;Katie Parrott&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/p&gt;&lt;h3&gt;&lt;hr class="quill-line"&gt;&lt;/h3&gt;&lt;h3&gt;&lt;strong&gt;Discuss&lt;/strong&gt;&lt;/h3&gt;&lt;blockquote&gt;“I feel like I got on the last plane out of Vietnam. I don’t think I can plan for a normal-length career, at least in this field.”—&lt;strong&gt;Christopher Pack&lt;/strong&gt;, software developer, in the &lt;em&gt;&lt;u&gt;&lt;a href="https://www.wsj.com/lifestyle/careers/changing-careers-cutting-expenses-software-engineers-contend-with-ai-3889ce73" rel="noopener noreferrer" target="_blank"&gt;Wall Street Journal&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/blockquote&gt;&lt;p&gt;For decades, a tech degree was the surest ticket to a stable, high-paying job. (Hence all the “learn to code” jabs directed at liberal arts majors.) Now, MVP applicants, and entry-level and mid-career software engineers are competing for a shrinking number of available roles as AI consumes more of their jobs. Per the &lt;em&gt;Journal&lt;/em&gt;, even experienced programmers are bracing for layoffs by hoarding cash, making risky stock market bets, or considering a career change.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;em&gt;&lt;a href="https://every.to/@laura_27bbaf_1" rel="noopener noreferrer" target="_blank"&gt;Laura Entis&lt;/a&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt; is a staff writer at Every. You can follow her on &lt;a href="https://www.linkedin.com/in/lauraentis/" rel="noopener noreferrer" target="_blank"&gt;LinkedIn&lt;/a&gt;. &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;a href="https://every.to/@katie.parrott12" rel="noopener noreferrer" target="_blank"&gt;Katie Parrott&lt;/a&gt;&lt;/em&gt;&lt;/strong&gt; &lt;em&gt;is a staff writer at Every. 
You can read more of her work in&lt;/em&gt; &lt;em&gt;&lt;a href="https://katieparrott.substack.com/" rel="noopener noreferrer" target="_blank"&gt;her newsletter&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;To read more essays like this, subscribe to &lt;u&gt;&lt;a href="https://every.to/subscribe" rel="noopener noreferrer" target="_blank"&gt;Every&lt;/a&gt;&lt;/u&gt;, and follow us on X at &lt;u&gt;&lt;a href="http://twitter.com/every" rel="noopener noreferrer" target="_blank"&gt;@every&lt;/a&gt;&lt;/u&gt; and on &lt;u&gt;&lt;a href="https://www.linkedin.com/company/everyinc/" rel="noopener noreferrer" target="_blank"&gt;LinkedIn&lt;/a&gt;&lt;/u&gt;.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;For sponsorship opportunities, reach out to sponsorships@every.to.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;Help us scale the only subscription you need to stay at the edge of AI. Explore &lt;u&gt;&lt;a href="https://www.notion.so/Jobs-Every-25cca4f355ac80c5ad6ee7a6e93d6b4e?pvs=21" rel="noopener noreferrer" target="_blank"&gt;open roles at Every&lt;/a&gt;&lt;/u&gt;.&lt;/em&gt;&lt;/p&gt;&lt;div class="quill-button" data-source="{&amp;quot;id&amp;quot;:&amp;quot;quill-button-1769187301610&amp;quot;,&amp;quot;url&amp;quot;:&amp;quot;https://every.to/subscribe?source=post_button&amp;quot;,&amp;quot;text&amp;quot;:&amp;quot;Subscribe&amp;quot;}" id="quill-button-1769187301610"&gt;&lt;a href="https://every.to/subscribe?source=post_button"&gt;Subscribe&lt;/a&gt;&lt;/div&gt;</description>
      <author>Laura Entis and Katie Parrott / Context Window</author>
      <pubDate>2026-06-17 01:00:00 -0400</pubDate>
      <guid>https://every.to/context-window/loops-for-non-coders</guid>
      <link>https://every.to/context-window/loops-for-non-coders</link>
    </item>
    <item>
      <title>We Built Our Own Agent-native Tool. It Overhauled How We Build Software.</title>
      <description>&lt;table&gt;&lt;tr&gt;&lt;td&gt;&lt;/td&gt;&lt;td&gt;&lt;table&gt;&lt;tr&gt;&lt;td&gt;by &lt;a href="https://every.to/@stella.f.garber" itemprop="name"&gt;Stella Garber&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;figure&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/post/cover/4303/full_page_cover_005c13ae9e6db818-cover-image-concept.png"&gt;&lt;figcaption&gt;Midjourney/Every illustration.&lt;/figcaption&gt;&lt;/figure&gt;&lt;p&gt;&lt;strong&gt;&lt;em&gt;&lt;a href="https://every.to/@stella.f.garber" rel="noopener noreferrer" target="_blank"&gt;Stella Garber&lt;/a&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt; cofounded Hoop, an AI agent that helps subscription brands cut churn, after years at Trello watching what makes software stick. When her team’s customer discovery calls became a mess of scattered notes and competing interpretations, they built an internal AI analysis tool from scratch using Every’s &lt;a href="https://every.to/guides/agent-native" rel="noopener noreferrer" target="_blank"&gt;agent-native architecture&lt;/a&gt; philosophy. It reshaped how they build their actual product. Plus: While you’re waiting for Fable 5 to return, we’ve compiled &lt;a href="https://every.to/p/claude-fable-5-prompt-library" rel="noopener noreferrer" target="_blank"&gt;13 copy-ready prompts&lt;/a&gt; based on the Every team’s workflows. 
Use them to plan, build, research, verify, and hand off complex work that runs for hours.—&lt;a href="https://every.to/@kate_1767" rel="noopener noreferrer" target="_blank"&gt;Kate Lee&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;&lt;div class="quill-button" data-source="{&amp;quot;id&amp;quot;:&amp;quot;quill-button-1769530239147&amp;quot;,&amp;quot;text&amp;quot;:&amp;quot;Get the Fable prompts&amp;quot;,&amp;quot;url&amp;quot;:&amp;quot;https://every.to/p/claude-fable-5-prompt-library?source=post_button&amp;quot;}" id="quill-button-1769530239147"&gt;&lt;a href="https://every.to/p/claude-fable-5-prompt-library?source=post_button"&gt;Get the Fable prompts&lt;/a&gt;&lt;/div&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;It was Monday morning, and my cofounder Brian was reading from our agent’s weekly analysis of customer discovery calls. “Subscription retention,” he said. “Five separate brands mentioned it as their top priority, and none of them trust existing AI tools to touch it.” &lt;/p&gt;&lt;p&gt;Just weeks ago, unearthing an insight like this would’ve been nearly impossible.&lt;/p&gt;&lt;p&gt;At my pre-product-market fit startup, we’d all been speaking with prospects and trying to figure out the positioning for our product, but keeping track of everything we learned was a mess across founders, platforms, and mediums. To share what we learned during our Monday meeting, Brian would read notes in Slack, collect transcripts from Granola, and try to make sense of it all in Claude Code.&lt;/p&gt;&lt;p&gt;We couldn’t afford to be that disorganized. We’d recently launched &lt;u&gt;&lt;a href="http://hoop.app?utm_source=Every&amp;amp;utm_medium=article&amp;amp;utm_campaign=agentnative" rel="noopener noreferrer" target="_blank"&gt;Hoop&lt;/a&gt;&lt;/u&gt;, an agent that helps subscription brands reduce churn, and we needed to learn as quickly as we could. 
So we had to talk to as many potential customers as possible, then rigorously document and score each call to separate polite interest from genuine demand. Everyone on our five-person team was putting in the effort, but each of us had our own process, our own tools, and our own interpretation of what happened on each customer call.&lt;/p&gt;&lt;p&gt;So my two cofounders and I—none of us with “engineer” in our title—built an internal tool to fix it. What we didn’t expect was that the tool would change how we built our actual product for customers, too.&lt;/p&gt;&lt;h2&gt;&lt;strong&gt;‘I should build something for this’&lt;/strong&gt;&lt;/h2&gt;&lt;p&gt;Our Monday meetings were so unmethodical because the information from our client calls was in different places depending on who had taken the call and whether they had notes based on Granola or another transcription tool. We had no way to see patterns and draw conclusions about what our potential customers wanted.  &lt;/p&gt;&lt;p&gt;Justin, my cofounder and the resident product expert, built the first version of a tool to bring those notes together in under 10 hours over a few days, fitting it in around his other priorities.&lt;/p&gt;&lt;p&gt;Here’s how it worked: You’d upload a Zoom transcript, the tool would run the transcript through four or five prompts, and you’d get a structured analysis scored against &lt;u&gt;&lt;a href="https://thephysicsofstartups.substack.com/p/the-pull-framework" rel="noopener noreferrer" target="_blank"&gt;the PULL criteria&lt;/a&gt;&lt;/u&gt;—a framework developed at Harvard Business School to help early-stage startups find product-market fit. The tool would also pull together all the conversations with a given prospect into a summary, so you could see the full arc of a relationship instead of just a snapshot from one call. 
Rather than digging through notes and transcripts, the tool gave us a consolidated analysis week over week to help us see what was working and what wasn’t.&lt;/p&gt;&lt;p&gt;Justin set up the app using tools we hadn’t used before: the &lt;u&gt;&lt;a href="http://next.js" rel="noopener noreferrer" target="_blank"&gt;Next.js framework&lt;/a&gt;&lt;/u&gt; with &lt;u&gt;&lt;a href="https://ui.shadcn.com/" rel="noopener noreferrer" target="_blank"&gt;ShadCN components&lt;/a&gt;&lt;/u&gt; for the user interface, Supabase for the database that compiled all the notes, and Claude’s API for the analysis. &lt;/p&gt;&lt;p&gt;For Justin, who had studied computer science but wasn’t writing much code anymore, it was an opportunity to dust off his skills and build his confidence with AI-native coding. He started by designing and building the visual interface because he is the kind of person who gets frustrated when software doesn’t look right, even if it functions. He made sure that the look and feel of the tool matched our brand, and got the components (buttons, labels, menus) looking clean before he went anywhere near the data.&lt;/p&gt;&lt;p&gt;Only then did he turn to the data. He had to make sure that the tool’s analysis of the customer conversations was better than what people were already producing on their own with Claude. Otherwise, we would never convince the whole team to use the same tool. So he wrote a prompt and refined it over several rounds, manually reviewing the output each time and drawing on &lt;u&gt;&lt;a href="https://platform.claude.com/docs/en/build-with-claude/prompt-engineering/claude-prompting-best-practices" rel="noopener noreferrer" target="_blank"&gt;Anthropic’s prompting best practices&lt;/a&gt;&lt;/u&gt; for Claude. 
&lt;/p&gt;&lt;div class="quill-block-image" id="quill-block-image-1781609285141-y5ljoom2b" data-source="{&amp;quot;dom_id&amp;quot;:&amp;quot;quill-block-image-1781609285141-y5ljoom2b&amp;quot;,&amp;quot;link&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4303/optimized_c2a92b37-2617-401c-828f-28ac7a13de67.png&amp;quot;,&amp;quot;image&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4303/optimized_c2a92b37-2617-401c-828f-28ac7a13de67.png&amp;quot;,&amp;quot;caption&amp;quot;:null,&amp;quot;error&amp;quot;:null}"&gt;&lt;div&gt;&lt;a href="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4303/optimized_c2a92b37-2617-401c-828f-28ac7a13de67.png" target="_blank" rel="noopener noreferrer"&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4303/optimized_c2a92b37-2617-401c-828f-28ac7a13de67.png" alt="Uploaded image"&gt;&lt;/a&gt;&lt;/div&gt;&lt;/div&gt;&lt;h2&gt;&lt;strong&gt;Still too much friction&lt;/strong&gt;&lt;/h2&gt;&lt;p&gt;The first version of the tool generated high-quality analysis, but too many parts of the process were still manual. You had to download the call transcript from Zoom, upload it manually to the tool, fill in the customer name and call type, and wait several minutes while it was processed. Then you’d create a link and share the analysis in Slack. &lt;/p&gt;&lt;p&gt;The team could search the transcripts and analysis in the tool, but it didn’t return good results. For example, I searched for prospects who’d had bad experiences with AI customer support tools and got no results back, even though I knew a head of customer experience had spent five minutes talking about how embarrassed they were by their AI sending off-brand responses to customers. 
The tool could only match the exact words in my query, not the meaning behind them.&lt;/p&gt;&lt;p&gt;And there was the classic adoption problem that we know all too well from our years at productivity tool Trello, where we’d previously worked. Justin’s tool was yet another place people had to remember to go, competing with Slack and Notion for our attention. &lt;/p&gt;&lt;h2&gt;&lt;strong&gt;Going agent-native&lt;/strong&gt;&lt;/h2&gt;&lt;p&gt;Then we found the answer to our woes. Justin had been reading about &lt;u&gt;&lt;a href="https://every.to/guides/agent-native" rel="noopener noreferrer" target="_blank"&gt;agent-native architecture&lt;/a&gt;&lt;/u&gt; on Every. Instead of hard-coding a sequence of prompts that run in a fixed order, you give a model a set of tools and let it reason about how to use them. And instead of building a destination app that requires people to come to you, you bring the tool to where people already work, like Slack.&lt;/p&gt;&lt;p&gt;Justin gave Claude Code the link to the article and said that he wanted to build a system that aligned with those architecture principles. The agent needed two tools: one to upload and read a transcript, and one to add and edit a partner profile. With those in place, all users had to do was send a transcript to the app in Slack. The agent confirmed the partner name and call details, then uploaded the transcript, ran the analysis, created a summary page, and posted it to our user feedback channel. &lt;/p&gt;&lt;p&gt;Justin started checking everything he built against the agent-native architecture guidelines, not just the product-market fit tool. He’d go into planning mode with Claude Code, lay out a new feature, and send it alongside the Every article back to Claude Code and ask: “Where is this aligned, and where is it not?” &lt;/p&gt;&lt;p&gt;Sometimes he deviated from the guidelines when he didn’t think that users needed AI for a specific task. 
For example, the tool tracked LLM token usage and cost—useful information, but not something users needed to query. Exposing it to the agent would have only created confusion.&lt;/p&gt;&lt;h2&gt;&lt;strong&gt;My turn in the codebase&lt;/strong&gt;&lt;/h2&gt;&lt;p&gt;I had a different problem. I needed to see the pipeline at a glance—who to follow up with, where each conversation stood—organized by people and stages, not just chronological call logs.&lt;/p&gt;&lt;p&gt;I opened Ghostty, a simple terminal app, copied the tool’s code so I could work on it locally on my laptop, and—hands a little shaky at the thought of directly editing code—fired up Claude Code. &lt;/p&gt;&lt;p&gt;The first thing I asked Claude Code to build was a lightweight view of the data so I could see the different stages of the pipeline at a glance. Ironically, I was asking Claude Code to build a Trello-like board of the information so I could move cards representing calls around. Old habits die hard. &lt;/p&gt;&lt;p&gt;I built incrementally. I noticed it’d be useful to have the status of a deal on the front of each card, so I prompted Claude Code, “Make sure each card has a one-line summary so I know at a glance where things stand.” Once deployed, deals came in with their status attached. Every time I wanted something different, I just asked. &lt;/p&gt;&lt;p&gt;Encouraged by Justin, I started merging code directly to the shared repo. Sometimes things broke, and nobody got in trouble, because the stakes were low and everyone treated it that way. &lt;/p&gt;&lt;h2&gt;&lt;strong&gt;Brian learns agent-native at 10 p.m.&lt;/strong&gt;&lt;/h2&gt;&lt;p&gt;Next up was Brian, our other cofounder, who was responsible for our weekly user learnings report. Before the new tool, he would often forget and end up scrolling through Notion, or riffing from memory on whatever conversations were freshest in his mind. 
Our learnings skewed toward recency rather than significance, and we missed patterns that only showed up when you looked at all the calls together. &lt;/p&gt;&lt;p&gt;So Brian built a feature to generate the report automatically. His first version was a link on his computer that spit out a Markdown file he would paste into Notion—it worked, but only for him. His second added a Learnings tab to the tool with a button that produced the same Markdown output, but now anyone could pull a report. &lt;/p&gt;&lt;p&gt;That solved the access problem, but a new one surfaced. The team couldn’t tweak the format or focus of the report without asking Brian to edit the code. Brian wanted to make the prompt for the report editable so the team could tune the reports without touching code. At 9:25 p.m., Justin suggested in Slack: “You should make the prompt an editable field and pass it to the agent.”&lt;/p&gt;&lt;p&gt;Brian didn’t understand what that meant, so he pasted Justin’s message into Claude and asked for an explanation. Claude walked him through the agent-native architecture philosophy, and Brian started to understand: Instead of giving a human a text box to edit the prompt, you give the agent a tool that lets it read and modify the prompt itself. Then anyone can talk to the tool in Slack and tell it what to change.&lt;/p&gt;&lt;p&gt;By 11:15 p.m., Brian had the feature working. He tested it by telling the Slack agent to rewrite the Learnings prompt in Klingon, and sure enough, every report came out in Klingon. It took two hours, from start to finish. &lt;/p&gt;&lt;p&gt;More practically, a recent report flagged that four out of our last six calls mentioned the same pain point: Brands were losing subscribers who hadn’t realized they’d signed up in the first place. Customers often subscribe to unlock a discount, forget it’s recurring, and cancel in surprise when the charge lands—exactly the moment our product could step in with a savings offer or a better-fit recommendation. 
None of us had connected those dots individually; the pattern only emerged when the tool looked across all the calls at once.&lt;/p&gt;&lt;p&gt;Then something unexpected happened. Duplicate records appeared in the Learnings tab, and Brian mentioned it to the bot in Slack. The bot looked at both records, figured out one was empty, and deleted it, even though no one had ever built a “find and delete duplicates” function. Brian had given the agent editing rights to the database, and the model reasoned through the problem on its own. That was the moment it clicked for Brian: If you give a reasoning model simple, powerful tools, it can handle situations you never thought to code for. &lt;/p&gt;&lt;h2&gt;&lt;strong&gt;What we took back to the product&lt;/strong&gt;&lt;/h2&gt;&lt;p&gt;The most important thing the tool did was change how we build our external-facing product. Justin took the agent-native patterns he’d tested internally and built multiple sales tools, including a helpdesk audit we can run with prospective customers to show them the value of Hoop before they sign up. &lt;/p&gt;&lt;p&gt;Brian went from building a Learnings tab to building something like an internal sales agent, a bot we call “Benny” that lives in Slack, taps into our sales tools, and runs tasks on command: qualifying leads, scoring them against our ideal customer profile, and updating Attio (our CRM). I went from building features for a customer relationship manager to having informed opinions about product architecture. &lt;/p&gt;&lt;p&gt;It also kept us honest. When every call is scored and summarized automatically, you can’t hide from what the data is telling you. When five different operators in a single week told us our pricing didn’t make sense, we had to face the facts and change it to win more deals. At a big company, a team of analysts would be writing quarterly reports off this data. We’re five people doing it weekly with a tool we built ourselves. 
&lt;/p&gt;&lt;p&gt;The tool will probably look different in another month, and that’s fine. The fluency that we’ve gained matters more: knowing when to give a model a tool versus when to hard-code a workflow, being comfortable with shipping and failing—and trying again. &lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;Stella Garber is cofounder and CEO of Hoop. &lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;To read more essays like this, subscribe to &lt;u&gt;&lt;a href="https://every.to/subscribe" rel="noopener noreferrer" target="_blank"&gt;Every&lt;/a&gt;&lt;/u&gt;, and follow us on X at &lt;u&gt;&lt;a href="http://twitter.com/every" rel="noopener noreferrer" target="_blank"&gt;@every&lt;/a&gt;&lt;/u&gt; and on &lt;u&gt;&lt;a href="https://www.linkedin.com/company/everyinc/" rel="noopener noreferrer" target="_blank"&gt;LinkedIn&lt;/a&gt;&lt;/u&gt;.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;For sponsorship opportunities, reach out to sponsorships@every.to.&lt;/em&gt;&lt;/p&gt;&lt;div class="quill-button" data-source="{&amp;quot;id&amp;quot;:&amp;quot;quill-button-1769187301610&amp;quot;,&amp;quot;url&amp;quot;:&amp;quot;https://every.to/subscribe?source=post_button&amp;quot;,&amp;quot;text&amp;quot;:&amp;quot;Subscribe&amp;quot;}" id="quill-button-1769187301610"&gt;&lt;a href="https://every.to/subscribe?source=post_button"&gt;Subscribe&lt;/a&gt;&lt;/div&gt;</description>
      <author>Stella Garber</author>
      <pubDate>2026-06-16 03:00:00 -0400</pubDate>
      <guid>https://every.to/p/we-built-our-own-agent-native-tool-it-overhauled-how-we-build-software</guid>
      <link>https://every.to/p/we-built-our-own-agent-native-tool-it-overhauled-how-we-build-software</link>
    </item>
    <item>
      <title>I Interviewed an AI Version of GitHub’s COO—Then Spoke to the Real One</title>
      <description>&lt;table&gt;&lt;tr&gt;&lt;td&gt;&lt;img alt="Also True for Humans" src="https://d24ovhgu8s7341.cloudfront.net/uploads/publication/logo/95/small_ath.png" /&gt;&lt;/td&gt;&lt;td&gt;&lt;/td&gt;&lt;td&gt;&lt;table&gt;&lt;tr&gt;&lt;td&gt;by &lt;a href="https://every.to/@mike_2114" itemprop="name"&gt;Mike Taylor&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;in &lt;a href="https://every.to/also-true-for-humans"&gt;Also True for Humans&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;figure&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/post/cover/4302/full_page_cover_e3754b8d6aa16b9f-Interviewed_an_AI_Version.png"&gt;&lt;figcaption&gt;Midjourney/Every illustration.&lt;/figcaption&gt;&lt;/figure&gt;&lt;p&gt;I’ve attended many tech conferences as a participant and a speaker, but this year’s Microsoft Build, &lt;u&gt;&lt;a href="https://every.to/also-true-for-humans/how-microsoft-is-building-for-a-world-of-metered-intelligence" rel="noopener noreferrer" target="_blank"&gt;the company’s flagship developer event&lt;/a&gt;&lt;/u&gt;, was my first as a member of the press. &lt;/p&gt;&lt;p&gt;To quell the imposter syndrome, I tried an experiment before I sat down with GitHub chief operating officer &lt;strong&gt;Kyle Daigle&lt;/strong&gt;, a true GitHub veteran who joined the company as a developer 13 years ago. I built a simulated version of Kyle—an AI persona distilled from his public writing, talks, and interviews—and asked the AI Kyle the same questions I planned to ask the real one.&lt;/p&gt;&lt;p&gt;I expected the output to be either eerily accurate or useless. It was neither—precisely what made it valuable.&lt;/p&gt;&lt;p&gt;Out of 12 questions, two responses were strong matches, four were partial matches, and six were material misses. To the simulation’s credit, when it lacked evidence, it said so instead of inventing something. 
Those holes were the most useful prep—they showed me what information wasn’t available on the public record, and therefore where I should spend my time in the live interview. &lt;/p&gt;&lt;p&gt;I’ve spent a lot of time talking to AI personas. My last startup, &lt;u&gt;&lt;a href="http://askrally.com" rel="noopener noreferrer" target="_blank"&gt;Ask Rally&lt;/a&gt;&lt;/u&gt;, was a virtual focus group tool. We found that AI is no substitute for the real thing, but in high-stakes scenarios, roleplay can help you get out of your own head, build confidence in your strategy, and avoid costly mistakes. We’re more predictable than we think, with some &lt;u&gt;&lt;a href="https://arxiv.org/abs/2411.10109" rel="noopener noreferrer" target="_blank"&gt;studies showing 85 percent accuracy&lt;/a&gt;&lt;/u&gt; in AI personas replicating real human responses.&lt;/p&gt;&lt;p&gt;What follows is the actual interview, with notes on what the simulation got right, what it missed, and where the comparison is interesting. We also went back to human Kyle—and his take surprised us more than the AI answers. &lt;/p&gt;&lt;h2&gt;&lt;strong&gt;1. Expanding the definition of a developer&lt;/strong&gt;&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Mike:&lt;/strong&gt; The demographics of the customer are changing. A lot of people who may never have used GitHub or developer products before are now using them. How has that changed the way you decide the product roadmap?&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Kyle Daigle:&lt;/strong&gt; GitHub has always had an expansive view of what a developer is. I started as a developer before I would have called myself a dev. I was writing code for myself, and I did not go to school for computer science. I was going to art school and wrote code to pay for art school.&lt;/p&gt;&lt;p&gt;That journey is important: I can create tools with a team and deliver them to people who want to build an app for themselves, their family, a startup, or a business. 
GitHub has serious developer tools used by the largest businesses, but when I look at something like the GitHub Copilot app, I see both developers running multiple projects and agent sessions and people on our legal or finance teams using it. Customers tell us the same thing. People the industry might call knowledge workers, or non-developers by trade, are using these tools to build little apps or assets.&lt;/p&gt;&lt;p&gt;Our focus is still very much on developers, but we want to make it easier for people to try writing code. There should always be an on-ramp into creating software, including through tools like the GitHub Copilot app.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Simulation note:&lt;/strong&gt; A partial match. The AI Kyle correctly predicted the real Kyle’s thesis: AI is expanding who gets to build software—and even offered a framework to test this that the real Kyle plausibly could have mentioned: &lt;em&gt;“The design test I keep coming back to is ‘no net new behavior.’ New capabilities should fit into the places where software work already happens.”&lt;/em&gt; But it couldn’t produce the art-school story or the legal-and-finance-teams example that made the real answer compelling. &lt;/p&gt;&lt;h2&gt;&lt;strong&gt;2. Helping maintainers handle a flood of pull requests&lt;/strong&gt;&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Mike:&lt;/strong&gt; How do you help developers deal with the burden of all the extra pull requests? Open source maintainers I talk to are drowning. What needs to happen to help them?&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Kyle Daigle:&lt;/strong&gt; For developers generally, we are building tools like Copilot code review. It is now agentic, so it finds more novel vulnerabilities, and you can comment and have the agent implement a change. Code review is an overlooked way to get pull requests into a state where they are much easier to review.&lt;/p&gt;&lt;p&gt;Agentic merge is another example. 
A pull request can be almost ready, but there are still manual steps to finish processing it. Instead, I can define what GitHub Copilot is allowed to do and tell it to merge the pull request, wait for CI, and wait for policies.&lt;/p&gt;&lt;p&gt;Open source has a unique set of needs because maintainers do not control who sends changes. We are focused on giving maintainers more control: whether they want to accept pull requests, who they want to accept them from, and how much work a contributor needs to do to demonstrate that a contribution will be meaningful. Every community is choosing a slightly different approach. GitHub wants to provide the building blocks and leave maintainers in control. If a standard practice emerges, we can cement a system around it, but we do not want to impose one first.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Simulation note:&lt;/strong&gt; The AI got the governing principle right and the substance wrong. Its best line—&lt;em&gt;“The system should give maintainers explicit rules and guardrails, not just a larger inbox”&lt;/em&gt;—is something the real Kyle could have said. But it named zero products. Copilot code review, agentic merge, contributor acceptance controls: all invisible to a persona built from public material, because they weren’t publicized until the event.&lt;/p&gt;&lt;h2&gt;&lt;strong&gt;3. Growth in agent-generated activity&lt;/strong&gt;&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Mike:&lt;/strong&gt; You have a front-row seat to this new agent economy. You said publicly that you have had more pull requests submitted in a month than in all of last year. How are those stats exploding?&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Kyle Daigle:&lt;/strong&gt; We are seeing much more activity on GitHub. Last October at GitHub Universe, we shared that there had been one billion commits on GitHub for the full year. We are on track for 14 billion if growth is linear this year, which it will not be. 
In March, 17 million pull requests were created by agents alone.&lt;/p&gt;&lt;p&gt;There is much more code being created. Sometimes people dismiss it as slop: code pushed up that nobody cares about. That is not really true. We are leaving the super-early-adoption stage. We are not at the peak, but we are climbing the hill and learning what we can build when it is not just Kyle building, but Kyle plus one, two, or N agents using my skills, resources, and context.&lt;/p&gt;&lt;p&gt;We are investing heavily in preparing for the next wave of growth because this does not seem to be growing and then plateauing. No matter where people build or what tools they use, the code ends up on GitHub for sharing and collaboration. We need to support everyone’s agent moment, not just GitHub Copilot.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Simulation note:&lt;/strong&gt; Miss. The AI recited last year’s public numbers—it can only re-serve the stats you already have. &lt;/p&gt;&lt;h2&gt;&lt;strong&gt;4. Business models for always-on agents&lt;/strong&gt;&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Mike:&lt;/strong&gt; How does the business model change? Freemium makes sense in a human-centered world where we go to bed, but agents are still working while we are asleep. Does that move things toward usage-based pricing?&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Kyle Daigle:&lt;/strong&gt; I do not think we know yet. Right now, Kyle has a license or uses GitHub.com for free, and we have always had API rate limits. That is usually where people see agent backpressure.&lt;/p&gt;&lt;p&gt;If you want to do much more, with something like 150 agents doing things at once, we want to enable that. At the same time, I want you to have a great core GitHub experience, with some amount of agent usage as a necessary part of it. 
It is similar to how GitHub evolved from free public repositories but no free private repositories to giving individuals free private repositories because people reasonably had code they did not want to put out into the world.&lt;/p&gt;&lt;p&gt;GitHub evolves as the industry and community evolve. We focus on making sure developers have what they need to be successful, then work with enterprises to make sure they have what they need at scale. Those needs are usually somewhat different.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Simulation note:&lt;/strong&gt; Both Kyles opened with the same thought: Nobody knows yet. But the real Kyle’s answer was so much more illustrative because of the hypothetical examples he used to explain his point. &lt;/p&gt;&lt;h2&gt;&lt;strong&gt;5. A dual role across GitHub and Microsoft&lt;/strong&gt;&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Mike:&lt;/strong&gt; Pricing leads back into the wider Microsoft orbit. You now have a dual role, with partial responsibility for the wider marketing organization. How has that changed your work, and how do you prioritize between the two roles?&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Kyle Daigle:&lt;/strong&gt; I have been at GitHub for 13 years, as a developer myself and leading engineering teams for much of that time. What has always been unique about GitHub is that we focus intensely on the developer. Enterprises buying the tools is great, but we are not building for the buyers; we are building for the users.&lt;/p&gt;&lt;p&gt;That has been my focus as GitHub COO, which I continue to do. As Microsoft’s chief marketing officer of developer, my goal is to look across Microsoft’s developer tools and technology and make sure we bring holistic solutions that feel authentic to developer experiences.&lt;/p&gt;&lt;p&gt;At events like Build, we have taken a different approach this year. We are in San Francisco, and the vibe is different from a traditional conference-hall setup. The focus is: Can I go to a session? 
Can I use the thing? I do not want to be pitched a thing; I have to be able to use it. The goal is to bring GitHub’s expertise, care, and focus on the developer to a broader impact across Microsoft.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Simulation note:&lt;/strong&gt; A clean miss. The AI Kyle questioned my facts: &lt;em&gt;“The available persona... does not document a dual role with partial responsibility for a wider marketing organization. I would verify that premise before treating it as established.”&lt;/em&gt; Proof that simulations based on public materials have an expiry date. &lt;/p&gt;&lt;h2&gt;&lt;strong&gt;6. Community speakers at Build&lt;/strong&gt;&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Mike:&lt;/strong&gt; Did I hear you say that this is the first Build conference to have external contributors and speakers?&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Kyle Daigle:&lt;/strong&gt; It is the first Build where, by intention, we have focused on speakers from the community in primary sessions. The keynote included people such as Peter, and there were sessions from Svelte and others.&lt;/p&gt;&lt;p&gt;Software development is a team sport. It would be silly to think that one company or one group, including GitHub or Microsoft, could answer every question. We all use open source and build on major open source projects. We should invite people in to tell their part of the story together. That is what developers want, and feedback shows that people are excited to see perspectives from Microsoft, GitHub, and outside the company at the same event.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Simulation note:&lt;/strong&gt; The AI declined to answer: &lt;em&gt;“I cannot confirm that from the available persona.”&lt;/em&gt; But that’s what you want from a simulation—for it to admit when it doesn’t know. &lt;/p&gt;&lt;h2&gt;&lt;strong&gt;7. 
Differentiating in a competitive market&lt;/strong&gt;&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Mike:&lt;/strong&gt; This is a very competitive market, and the pace of change is quick. How do you differentiate?&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Kyle Daigle:&lt;/strong&gt; We continue to focus on our roots: developer choice, building for builders, and enabling builders. We went from an era of many APIs and broad access into a somewhat unintentional walled-garden setup, where people develop an affinity for one tool and then realize that trying something interesting elsewhere means learning a new tool or opening a new account.&lt;/p&gt;&lt;p&gt;We want developers building with GitHub to use other tools, and we will partner with everyone to make that as simple as possible. Other companies do similar things, but our ability to support choice across the entirety of building software, not only code generation or collaboration and review, is a real strength.&lt;/p&gt;&lt;p&gt;We will invest in our own technology, including new Microsoft AI models, while continuing to partner with Anthropic, OpenAI, Google, and anyone bringing a model or coding agent to market. We will let developers bring those tools to GitHub or use them through GitHub and GitHub Copilot. Choice is core. If we back down on that, developers will still choose; they will just be stuck in another walled garden.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Simulation note:&lt;/strong&gt; This answer was the strongest match of the 12 to human Kyle’s response. The AI Kyle said: &lt;em&gt;“The differentiator is not having one more agent in a crowded market...That means choice across providers rather than lock-in to a single model.”&lt;/em&gt; The real Kyle even reached for the same image—walled gardens. This was the part of the interview where I least needed the simulation.&lt;/p&gt;&lt;h2&gt;&lt;strong&gt;8. 
Dogfooding while staying open to other tools&lt;/strong&gt;&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Mike:&lt;/strong&gt; There was a recent news cycle about Claude Code licenses being canceled. How do you make the trade-off between dogfooding your own products, such as your new models or the GitHub Copilot desktop app, and letting developers experiment with other tools?&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Kyle Daigle:&lt;/strong&gt; We all use a variety of tools because otherwise you lose track and become too focused on your own work. I have been a daily MacBook user for years, and I use Windows PCs when I play video games. When I got this role, I set up my Mac, a PC, and an AlmaLinux box. I code most Saturdays, and I swap between the boxes because I want to understand each experience. I use the GitHub Copilot app only on Windows because developers on Windows deserve great apps too; it is not only about the Mac audience.&lt;/p&gt;&lt;p&gt;That is true across our teams. We look at coding agents, harnesses, desktop apps, memory management, and everything else. Everyone is building and using these tools. We put most of our energy into our own tools, but it is a blind spot to focus so narrowly that you lose perspective. When something new comes out, I want to know why people are having a great experience with it. That helps me understand where to focus and why a developer would pick a particular tool. The same goes for our teams.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Simulation note:&lt;/strong&gt; This was the second strongest match. The AI nailed the concept in a line the real Kyle would happily sign his name to: &lt;em&gt;“Dogfooding should sharpen the product. It should not become a reason to ignore the tools developers find useful.”&lt;/em&gt; It could not have known, however, that the COO of GitHub spends Saturdays rotating between a Mac, a PC, and an AlmaLinux box. &lt;/p&gt;&lt;h2&gt;&lt;strong&gt;9. 
Filtering short-lived ideas from longer-term bets&lt;/strong&gt;&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Mike:&lt;/strong&gt; A lot of these ideas are relatively short-lived, while enterprise product-development cycles are longer-lived. How do you filter ideas and decide what to pursue?&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Kyle Daigle:&lt;/strong&gt; In the short term, we are focused on supporting a multitude of agent sessions because that seems clear: everyone is doing it, so how can we cement it? Over the longer term, models will continue to improve, token economics will become a bigger factor in what people use, and I believe we are not far from having serious ability to use something above a small language model on a local device for some work.&lt;/p&gt;&lt;p&gt;If there is that much optionality around tokens, the consistent truth since ChatGPT and GitHub Copilot emerged is personalization, context, fine-tuning with context, and memory. There have been experiments across the industry, but not a long-term vision. Supporting many agents is important because someone using agents will not sit and stare at one agent working. But that alone will not produce a great long-term experience. A great experience is using an agent that feels like it is completing a thought for you without forcing you to codify that thought yourself.&lt;/p&gt;&lt;p&gt;The agent should be able to intuit that, or use post-training, fine-tuning, or frontier tuning to deeply understand how I work. Sometimes we work on the short term, and sometimes we take repeated attempts at the long term until something tangible helps us move forward.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Simulation note:&lt;/strong&gt; This is where the simulation’s favorite crutch broke down. The AI Kyle reached—again—for human Kyle’s best-documented framework: &lt;em&gt;“I use a constraint-first hierarchy: MUST, SHOULD, and COULD.”&lt;/em&gt; It used that framework in five of my 12 questions, and the real Kyle never mentioned it once. 
When a persona runs out of evidence, it over-applies the pattern it knows best. That’s a subtler failure than hallucination, but a failure nonetheless. As models get better, they should get less fixated on individual pieces of context and relax rules like humans do. &lt;/p&gt;&lt;h2&gt;&lt;strong&gt;10. Hill climbing as a product-development loop&lt;/strong&gt;&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Mike:&lt;/strong&gt; I heard the term “hill climbing” 100 times yesterday. Can you talk about how that became such a big focus?&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Kyle Daigle:&lt;/strong&gt; &lt;strong&gt;Satya [Nadella]&lt;/strong&gt;, &lt;strong&gt;Mustafa [Suleyman]&lt;/strong&gt;, and &lt;strong&gt;Jacob [Andreou]&lt;/strong&gt; leading the Copilot group talk about it often. The biggest thing we have learned is that using the tools has to be a core way to improve the underlying experience of the models. The evals are essential. Thumbs-up and thumbs-down data, whether people accept a suggestion, and how much they accept all contribute to creating a useful experience for everyone.&lt;/p&gt;&lt;p&gt;Every week we talk about hill-climbing results. We look at hard measures and soft measures because sometimes evals and rubrics show improvement while user sentiment crashes, even with the same latency and performance. You can overfit.&lt;/p&gt;&lt;p&gt;The goal is to run that loop quickly, then give everyone a hill-climbing machine without forcing them to do it the hard way. In an enterprise using M365, there is potentially rich data in assets, documents, and chats. Turning on something like frontier tuning and using an MAI Phi-3 model as a base can show real results without extra work. At first, that sounded like a magic trick that could not be real. But sometimes the opportunity is where something feels too simple to work.&lt;/p&gt;&lt;p&gt;That is why we say hill climbing so much. It is not a moonshot. It is climb, improve, add an eval, improve, add new data, and improve again. 
That is how we reached a point where we can launch models for ourselves and allow customers to use similar tooling.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Simulation note:&lt;/strong&gt; The AI declined to bluff by saying, &lt;em&gt;“The available persona does not document me using ‘hill climbing’ as a specific organizational term, so I would not manufacture an origin story for it,” &lt;/em&gt;then provided some abstract explanations of fast iteration. The real answer included new information that the model would never have had access to. &lt;/p&gt;&lt;h2&gt;&lt;strong&gt;11. Keeping AI subscriptions affordable&lt;/strong&gt;&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Mike:&lt;/strong&gt; Is hill climbing the answer to stopping a $200 subscription from becoming a $2,000 subscription?&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Kyle Daigle:&lt;/strong&gt; Frontier tuning models so they know you better is part of the answer. Another important part is helping developers automatically choose models.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Mike:&lt;/strong&gt; Like the model router in GitHub Copilot?&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Kyle Daigle:&lt;/strong&gt; Exactly. GitHub has an automatic model router with task intent, and Microsoft Foundry has a model router at the API level. The more we can let people tell us where their bars are, such as “This is an incredibly hard problem and I am willing to go all the way to the top,” or, “I want to stay here,” the more we can help choose the model.&lt;/p&gt;&lt;p&gt;Tokens are often expensive because people choose the model of the day, week, or hour, and those models can be expensive. But a train of thought moves between hard problems and simple ones. I might ask an agent to do an enormous amount of work, then have a final step that is just changing names. I probably will not manually switch from an expensive model down to a smaller one just to save tokens for that step, but the tools could. 
That will help enterprises, individual developers, and people building automations with the Copilot SDK.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Simulation note:&lt;/strong&gt; The AI got the gist of the problem right but couldn’t solve it. It framed the problem, noting that a model doing more work doesn’t automatically mean a cheaper bill, while the real Kyle named a concrete tool for solving it: the automatic model router.&lt;/p&gt;&lt;h2&gt;&lt;strong&gt;12. Unusual personal uses for agents&lt;/strong&gt;&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Mike:&lt;/strong&gt; I made an AI version of you to practice this interview and found it immensely useful. What other unusual things are you seeing people do with agents internally or externally?&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Kyle Daigle:&lt;/strong&gt; I do something similar. I have one setup through the app and another Claude instance that cannot talk to work systems, so there is separation of state. I spend a lot of time having it read what I write. This interview will ultimately get fed into it. Every day I receive a communications report that says things like, “Kyle, you keep saying this,” or, “This is not super clear.” Because I write and speak in a particular way and like to use metaphors, it gives me examples of metaphors that are clear.&lt;/p&gt;&lt;p&gt;The self-improvement loop for a human is incredibly powerful. We used to talk about this with Hubot and ChatOps at GitHub: Humans are much more willing to take critical feedback from robots than from other humans. When my Claude instance tells me how poorly I did at something, I feel better asking it to explain why and using that feedback when I write emails, write a script, or review details.&lt;/p&gt;&lt;p&gt;A lot of my agent loop is about me rather than software. It looks backward: read Kyle’s emails and Slack messages from the last seven days, give feedback, then look back at the advice and check whether Kyle acted on it. 
That loop is extremely powerful. It is the kind of personal consumer AI experience I want.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Simulation note:&lt;/strong&gt; Funny that the question about simulated interviews was also a miss. Asked to name unusual agent uses, the AI Kyle admitted it didn’t have much to offer and fell back on vague talk about removing toil. The real Kyle’s answer revealed a concrete, personal workflow that’s truly interesting and helpful. &lt;/p&gt;&lt;h2&gt;&lt;strong&gt;The best interview preparation is knowing where to dig deeper&lt;/strong&gt;&lt;/h2&gt;&lt;p&gt;Every managing editor &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@eleanor_b03474_1" rel="noopener noreferrer" target="_blank"&gt;Eleanor Warnock&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; wrote recently about what she calls &lt;u&gt;&lt;a href="https://every.to/p/socrates-as-a-service" rel="noopener noreferrer" target="_blank"&gt;Socrates-as-a-service&lt;/a&gt;&lt;/u&gt;: the human skill of pulling out ideas people haven’t yet put into words, like the anecdote that becomes a front-page story or the detail that crystallizes a philosophy into something a reader remembers. &lt;/p&gt;&lt;p&gt;That’s exactly the gap this experiment helped me see. The simulation knew about Kyle’s positions on developer choice and walled gardens because he’s made them clear in public for years. It couldn’t know that he codes on Saturdays and rotates between three machines to stay honest about other people’s tools. &lt;/p&gt;&lt;p&gt;Asked for comment on the experiment after the real interview, Daigle said, “I thought the simulated interview was pretty good! Mostly, it overindexed on my written work rather than my spoken interviews and podcasts. Without access to everything that I’ve written internally, it went harder on topics I’ve spoken about on my blog, etc. than I normally speak about.” &lt;/p&gt;&lt;p&gt;This is the real argument for doing an AI-simulated test run before an interview. 
It will show you where the gaps in the public record are, so you can spend the interview filling in those gaps and extracting truly original, scarce knowledge for your readers and the world. &lt;/p&gt;&lt;p&gt;Daigle found a use for my answers, too. “Even reading the AI responses, I found it clarifying my thinking and the sharpness of my answers, so it helped me, too.” &lt;/p&gt;&lt;p&gt;Then he added: “I actually did a similar thing for Mike. What questions I expected him to ask. I didn’t save the output—I garbage-collect a lot of it—so it’s funny how we’re all operating.”&lt;/p&gt;&lt;p&gt;Maybe the joke was on me all along. &lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;The simulated Kyle Daigle was built from public material current through May 31, 2026, using a &lt;u&gt;&lt;a href="https://github.com/nickwinder/synthteam" rel="noopener noreferrer" target="_blank"&gt;modified version of the open-source library SynthTeam&lt;/a&gt;&lt;/u&gt;, and scored against the real transcript afterward: two strong matches, four partial, six misses. Full methodology and the unabridged synthetic interview are available on request.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://every.to/@mike_2114" rel="noopener noreferrer" target="_blank"&gt;Mike Taylor&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt; &lt;em&gt;is the head of tech consulting at Every and a co-author of &lt;/em&gt;&lt;u&gt;&lt;a href="https://www.oreilly.com/library/view/prompt-engineering-for/9781098153427/" rel="noopener noreferrer" target="_blank"&gt;Prompt Engineering for Generative AI&lt;/a&gt;&lt;/u&gt; (O’Reilly)&lt;em&gt;. 
To read more essays like this, subscribe to &lt;u&gt;&lt;a href="https://every.to/subscribe" rel="noopener noreferrer" target="_blank"&gt;Every&lt;/a&gt;&lt;/u&gt;, and follow us on X at &lt;u&gt;&lt;a href="http://twitter.com/every" rel="noopener noreferrer" target="_blank"&gt;@every&lt;/a&gt;&lt;/u&gt; and on &lt;u&gt;&lt;a href="https://www.linkedin.com/company/everyinc/" rel="noopener noreferrer" target="_blank"&gt;LinkedIn&lt;/a&gt;&lt;/u&gt;.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;We also do AI training, adoption, and innovation for companies. &lt;u&gt;&lt;a href="https://every.to/consulting?utm_source=emailfooter" rel="noopener noreferrer" target="_blank"&gt;Work with us&lt;/a&gt;&lt;/u&gt; to bring AI into your organization.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;For sponsorship opportunities, reach out to sponsorships@every.to.&lt;/em&gt;&lt;/p&gt;&lt;div class="quill-button" data-source="{&amp;quot;id&amp;quot;:&amp;quot;quill-button-1769187301610&amp;quot;,&amp;quot;url&amp;quot;:&amp;quot;https://every.to/subscribe?source=post_button&amp;quot;,&amp;quot;text&amp;quot;:&amp;quot;Subscribe&amp;quot;}" id="quill-button-1769187301610"&gt;&lt;a href="https://every.to/subscribe?source=post_button"&gt;Subscribe&lt;/a&gt;&lt;/div&gt;</description>
      <author>Mike Taylor / Also True for Humans</author>
      <pubDate>2026-06-15 06:00:00 -0400</pubDate>
      <guid>https://every.to/also-true-for-humans/i-interviewed-an-ai-version-of-github-s-coo-then-spoke-to-the-real-one</guid>
      <link>https://every.to/also-true-for-humans/i-interviewed-an-ai-version-of-github-s-coo-then-spoke-to-the-real-one</link>
    </item>
    <item>
      <title>Fable, Disabled</title>
      <description>&lt;table&gt;&lt;tr&gt;&lt;td&gt;&lt;img alt="Context Window" src="https://d24ovhgu8s7341.cloudfront.net/uploads/publication/logo/94/small_context_windown_1.png" /&gt;&lt;/td&gt;&lt;td&gt;&lt;/td&gt;&lt;td&gt;&lt;table&gt;&lt;tr&gt;&lt;td&gt;by &lt;a href="https://every.to/@Every%20Staff" itemprop="name"&gt;Every Staff&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;in &lt;a href="https://every.to/context-window"&gt;Context Window&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;figure&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/post/cover/4301/full_page_cover_acb73bf3c2eeb1b9-Context_Window_Cover_Image.png"&gt;&lt;figcaption&gt;Midjourney/Every illustration.&lt;/figcaption&gt;&lt;/figure&gt;&lt;p&gt;On Friday night, the U.S. government banned Anthropic’s distribution of Fable 5 and Mythos 5 to non-U.S. nationals. In response, Anthropic disabled Fable for &lt;em&gt;all&lt;/em&gt; customers. As of this writing, the situation is ongoing. &lt;/p&gt;&lt;p&gt;It remains to be seen how it will play out, but I can already see the difference in my AI usage. 
Here’s a graph comparing my Claude and Codex usage before and after the ban (the “event” below):&lt;/p&gt;&lt;div class="quill-block-image" id="quill-block-image-1781398787200" data-source="{&amp;quot;dom_id&amp;quot;:&amp;quot;quill-block-image-1781398787200&amp;quot;,&amp;quot;link&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4301/optimized_0412b932-1e94-42ee-8720-c7a8f840ed2f.png&amp;quot;,&amp;quot;image&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4301/optimized_0412b932-1e94-42ee-8720-c7a8f840ed2f.png&amp;quot;,&amp;quot;caption&amp;quot;:&amp;quot;Image courtesy of Dan Shipper.&amp;quot;,&amp;quot;error&amp;quot;:null}"&gt;&lt;div&gt;&lt;a href="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4301/optimized_0412b932-1e94-42ee-8720-c7a8f840ed2f.png" target="_blank" rel="noopener noreferrer"&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4301/optimized_0412b932-1e94-42ee-8720-c7a8f840ed2f.png" alt="Image courtesy of Dan Shipper."&gt;&lt;/a&gt;&lt;figcaption class="quill-image-caption"&gt;Image courtesy of Dan Shipper.&lt;/figcaption&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;&lt;br&gt;&lt;/p&gt;&lt;p&gt;Before the ban, I was split about evenly between Claude and Codex. After (and after a period where I was using neither because I was sleeping), I switched almost entirely to Codex. &lt;/p&gt;&lt;p&gt;My guess is that this ban is not going to last very long. It seems to rest on a &lt;u&gt;&lt;a href="https://www.wsj.com/tech/ai/amazon-ceos-talks-with-u-s-officials-triggered-crackdown-on-anthropic-models-dcc90578?st=9r4tnM&amp;amp;reflink=desktopwebshare_permalink" rel="noopener noreferrer" target="_blank"&gt;misunderstanding between the government and Anthropic&lt;/a&gt;&lt;/u&gt; about which kinds of guardrail bypasses are fixable and what counts as an adequate solution. 
Anthropic believes the jailbreak identified by the government is narrow rather than universal—it surfaces only minor vulnerabilities that other public models are already susceptible to. The government apparently believes otherwise. Because both sides are highly incentivized to work this out, I’d bet that the ban is revoked after a few days—and demand for the newly returned Fable skyrockets. &lt;/p&gt;&lt;p&gt;However, this kind of move is extremely disruptive and distracting for people working at Anthropic. The only comparable scenario I can remember is &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/everything/thinksgiving-is-upon-us" rel="noopener noreferrer" target="_blank"&gt;Sam Altman&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;&lt;u&gt;&lt;a href="https://every.to/everything/thinksgiving-is-upon-us" rel="noopener noreferrer" target="_blank"&gt;’s firing&lt;/a&gt;&lt;/u&gt;, which was resolved relatively quickly. Even though Altman was reinstated, I do think the chaos disrupted the company’s momentum for months afterward.&lt;/p&gt;&lt;p&gt;We’ll keep a close eye on whether the same is true here.—&lt;em&gt;&lt;u&gt;&lt;a href="https://every.to/@danshipper" rel="noopener noreferrer" target="_blank"&gt;Dan Shipper&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;Was this newsletter forwarded to you? 
&lt;u&gt;&lt;a href="https://every.to/account" rel="noopener noreferrer" target="_blank"&gt;Sign up&lt;/a&gt;&lt;/u&gt; to get it in your inbox&lt;/em&gt;.&lt;/p&gt;&lt;h2&gt;&lt;hr class="quill-line"&gt;&lt;/h2&gt;&lt;h2&gt;Knowledge base&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/vibe-check/anthropic-mythos-our-fable-vibe-check" rel="noopener noreferrer" target="_blank"&gt;“Vibe Check: Fable 5 Is the Best Coding Model in the World”&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; &lt;em&gt;by &lt;u&gt;&lt;a href="https://every.to/@danshipper" rel="noopener noreferrer" target="_blank"&gt;Dan Shipper&lt;/a&gt;&lt;/u&gt; and &lt;u&gt;&lt;a href="https://every.to/@katie.parrott12" rel="noopener noreferrer" target="_blank"&gt;Katie Parrott&lt;/a&gt;&lt;/u&gt;/&lt;u&gt;&lt;a href="https://every.to/vibe-check" rel="noopener noreferrer" target="_blank"&gt;Vibe Check&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;: Anthropic’s Fable 5, the first model in its Mythos class, tops every model Every has tested on coding. It’s also overpowered, and overpriced, for most everyday work. Dan and &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@katie.parrott12" rel="noopener noreferrer" target="_blank"&gt;Katie Parrott&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; ran it through Every’s standard evaluations and landed on one rule: Reach for it on long, complex, minimally supervised tasks, and stay with GPT-5.5 or Opus 4.8 for the rest. 
Read this for the head-to-head verdict and when Fable 5 is worth the cost.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/context-window/how-to-get-the-most-out-of-fable-5" rel="noopener noreferrer" target="_blank"&gt;“How to Get the Most Out of Fable 5”&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; &lt;em&gt;by &lt;u&gt;&lt;a href="https://every.to/@laura_27bbaf_1" rel="noopener noreferrer" target="_blank"&gt;Laura Entis&lt;/a&gt;&lt;/u&gt;/&lt;u&gt;&lt;a href="https://every.to/context-window" rel="noopener noreferrer" target="_blank"&gt;Context Window&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;: The fastest way to be let down by Fable 5 is to prompt it like GPT-5.5. It rewards organized context, a clear definition of done, and room to run, shown through four worked examples: senior engineer &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@nityesh" rel="noopener noreferrer" target="_blank"&gt;Nityesh Agarwal&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; repairing a broken PowerPoint-generation workflow, head of growth &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@tedescau" rel="noopener noreferrer" target="_blank"&gt;Austin Tedesco&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; building a go-to-market strategy from scratch, &lt;strong&gt;&lt;u&gt;&lt;a href="https://cora.computer" rel="noopener noreferrer" target="_blank"&gt;Cora&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; general manager &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@kieran_1355" rel="noopener noreferrer" target="_blank"&gt;Kieran Klaassen&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; batching scattered product feedback into one set of changes, and head of platform &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@williewilliams" rel="noopener noreferrer" target="_blank"&gt;Willie Williams&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; building from a detailed original plan. 
It comes with a copy-ready &lt;u&gt;&lt;a href="https://every.to/p/claude-fable-5-prompt-library" rel="noopener noreferrer" target="_blank"&gt;Claude Fable 5 prompt library&lt;/a&gt;&lt;/u&gt; to start from. Read this for the before-and-after prompts.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/context-window/ai-everywhere-all-at-once" rel="noopener noreferrer" target="_blank"&gt;“AI Everywhere, All at Once”&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; &lt;em&gt;by &lt;u&gt;&lt;a href="https://every.to/@laura_27bbaf_1" rel="noopener noreferrer" target="_blank"&gt;Laura Entis&lt;/a&gt;&lt;/u&gt;/&lt;u&gt;&lt;a href="https://every.to/context-window" rel="noopener noreferrer" target="_blank"&gt;Context Window&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;: Four Every team members rewire their setups for Fable 5: Austin saves it for hours-long “rocket launcher” projects, Kieran makes it the middle of his “AI sandwich,” Willie argues the one thing even the best models can’t do is vibe, and head of tech consulting &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@mike_2114" rel="noopener noreferrer" target="_blank"&gt;Mike Taylor&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; keeps it away from confidential client work, since its environment can retain context past a task and break an NDA. 
Plus a dispatch from Apple’s developer conference, where &lt;strong&gt;&lt;u&gt;&lt;a href="https://monologue.to" rel="noopener noreferrer" target="_blank"&gt;Monologue&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; general manager &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@naveen_6804" rel="noopener noreferrer" target="_blank"&gt;Naveen Naidu&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; finds a Google-powered Siri that’s finally good and Apple opens free on-device AI to smaller apps, and senior designer &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@daniel_5fbd21_1" rel="noopener noreferrer" target="_blank"&gt;Daniel Rodrigues&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; also breaks down how he builds Every’s animated hero images. Read this for the four setups and the Apple developer shift.&lt;/p&gt;&lt;p&gt;🎧 🖥 &lt;strong&gt;&lt;u&gt;&lt;a href="https://www.youtube.com/watch?v=XWpTgCvgYaE" rel="noopener noreferrer" target="_blank"&gt;“How Anthropic Uses Claude Fable 5 With Mike Krieger”&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; &lt;em&gt;by &lt;u&gt;&lt;a href="https://every.to/@danshipper" rel="noopener noreferrer" target="_blank"&gt;Dan Shipper&lt;/a&gt;&lt;/u&gt;/&lt;u&gt;&lt;a href="https://every.to/podcast" rel="noopener noreferrer" target="_blank"&gt;AI &amp;amp; I&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;: &lt;strong&gt;Mike Krieger&lt;/strong&gt;—Instagram’s cofounder and now head of Anthropic Labs—shows Dan how Anthropic itself works with Fable 5, handing the model long, ambitious jobs at night and trusting they’ll be done by morning. Watch or listen to this for how the people who built the model use it. 
🎧 🖥 Listen on &lt;u&gt;&lt;a href="https://open.spotify.com/episode/7s1VcIHp1q6PG9hofb2fVY?si=DsAlKVymRs2-J0cnM25M6w&amp;amp;nd=1&amp;amp;dlsi=4383c06a09314ba1" rel="noopener noreferrer" target="_blank"&gt;Spotify&lt;/a&gt;&lt;/u&gt; or &lt;u&gt;&lt;a href="https://podcasts.apple.com/us/podcast/how-anthropic-uses-claude-fable-5-with-mike-krieger/id1719789201?i=1000772067637" rel="noopener noreferrer" target="_blank"&gt;Apple Podcasts&lt;/a&gt;&lt;/u&gt;, watch on &lt;u&gt;&lt;a href="https://www.youtube.com/watch?v=XWpTgCvgYaE" rel="noopener noreferrer" target="_blank"&gt;YouTube&lt;/a&gt;&lt;/u&gt;, or follow the discussion &lt;u&gt;&lt;a href="https://x.com/danshipper/status/2064761654789681281" rel="noopener noreferrer" target="_blank"&gt;on X&lt;/a&gt;&lt;/u&gt;.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/working-overtime/my-editor-caught-me-sounding-like-ai-now-ai-catches-me-first" rel="noopener noreferrer" target="_blank"&gt;“My Editor Caught Me Sounding Like AI. Now AI Catches Me First.”&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; &lt;em&gt;by &lt;u&gt;&lt;a href="https://every.to/@katie.parrott12" rel="noopener noreferrer" target="_blank"&gt;Katie Parrott&lt;/a&gt;&lt;/u&gt;/&lt;u&gt;&lt;a href="https://every.to/working-overtime" rel="noopener noreferrer" target="_blank"&gt;Working Overtime&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;: Katie found a list of her own writing tics—symmetrical sentences, throat-clearing introductions, and phrases that sound like something but say nothing—in a shared document with her editor. So she built a Spiral skill that encodes those tells and flags them before her editor has to. 
Read this for the build-your-own-AI-editor workflow.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/chain-of-thought/the-moral-of-fable" rel="noopener noreferrer" target="_blank"&gt;“The Moral of Fable”&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; &lt;em&gt;by &lt;u&gt;&lt;a href="https://every.to/@danshipper" rel="noopener noreferrer" target="_blank"&gt;Dan Shipper&lt;/a&gt;&lt;/u&gt;/&lt;u&gt;&lt;a href="https://every.to/chain-of-thought" rel="noopener noreferrer" target="_blank"&gt;Chain of Thought&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;: Fable 5 is the best coding model in the world, but to most knowledge workers it feels incremental. The people within Every feeling its full force have changed how they work: Kieran now hands whole projects to agents and folds what he learns into the next run. The catch is cost: Fable burns tokens fast enough that running it takes serious capital, which could put the frontier out of reach for most. Read this for what AI-native work looks like at its best, and who can afford it.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;Log on&lt;/h2&gt;&lt;p&gt;Get hands-on with how Every uses AI. These are the &lt;u&gt;&lt;a href="https://every.to/events" rel="noopener noreferrer" target="_blank"&gt;live camps, workshops, and meetups&lt;/a&gt;&lt;/u&gt; where team members teach the workflows behind our work.&lt;/p&gt;&lt;h5&gt;Upcoming camp&lt;/h5&gt;&lt;ul&gt;&lt;li&gt;&lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/events/codex-camp-our-power-user-guide" rel="noopener noreferrer" target="_blank"&gt;Codex Power User Camp&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;: On June 26, Dan and the Every team host a two-hour live walkthrough of the &lt;u&gt;&lt;a href="https://every.to/guides/codex-for-knowledge-work?source=post_button" rel="noopener noreferrer" target="_blank"&gt;Codex power-user guide&lt;/a&gt;&lt;/u&gt;, including setup, workflows, and Codex-native app development. 
&lt;u&gt;&lt;a href="https://every.to/events/codex-camp-our-power-user-guide" rel="noopener noreferrer" target="_blank"&gt;Learn more and register&lt;/a&gt;&lt;/u&gt;.&lt;/li&gt;&lt;/ul&gt;&lt;h5&gt;Recordings you may have missed&lt;/h5&gt;&lt;ul&gt;&lt;li&gt;&lt;strong&gt;Fable 5 Camp&lt;/strong&gt;: On June 12, Dan and the Every team hosted a live walkthrough of how to get the most out of Fable 5—setup, prompting, and the workflows built around long, hands-off tasks. &lt;a href="https://every.to/events/fable-5-power-user-camp" rel="noopener noreferrer" target="_blank"&gt;Watch the recording&lt;/a&gt;.&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;Alignment&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Uncharted waters. &lt;/strong&gt;Last week, Anthropic, now worth close to $1 trillion, asked the world to &lt;u&gt;&lt;a href="https://www.wsj.com/tech/ai/anthropic-urges-global-pause-in-ai-development-flags-self-improvement-risk-99cefb73" rel="noopener noreferrer" target="_blank"&gt;pause AI development&lt;/a&gt;&lt;/u&gt; because of how close we may be to AI recursively improving itself. In the same week, a team at Columbia University &lt;u&gt;&lt;a href="https://www.nytimes.com/2026/06/04/science/embryos-gene-editing-crispr.html?unlocked_article_code=1.n1A.FM9j.7C1NNe9ikUm1" rel="noopener noreferrer" target="_blank"&gt;edited the genes&lt;/a&gt;&lt;/u&gt; of human embryos, correcting mutations tied to high cholesterol and a blood disorder by swapping a single letter of DNA.&lt;/p&gt;&lt;p&gt;Depending on your disposition, this is either exhilarating or terrifying—probably both. I don’t know what the right guardrails are, or whether I believe they’re actually enforceable. But each week as I read the headlines, some part of me desperately wants to go back to 2008, when I was in high school and the most destabilizing thing in my world was beefing with my arch-nemesis on Facebook. That world is long gone, and isn’t coming back. 
We’re in uncharted waters now, and the only thing left to decide is whether we sail with our eyes open or closed.—&lt;em&gt;&lt;u&gt;&lt;a href="https://x.com/Ashwinreads" rel="noopener noreferrer" target="_blank"&gt;Ashwin Sharma&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;That’s all for this week! Be sure to follow Every on X at &lt;u&gt;&lt;a href="https://twitter.com/every" rel="noopener noreferrer" target="_blank"&gt;@every&lt;/a&gt;&lt;/u&gt; and on &lt;u&gt;&lt;a href="https://www.linkedin.com/company/everyinc/" rel="noopener noreferrer" target="_blank"&gt;LinkedIn&lt;/a&gt;&lt;/u&gt;.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;We &lt;u&gt;&lt;a href="https://every.to/studio" rel="noopener noreferrer" target="_blank"&gt;build AI tools&lt;/a&gt;&lt;/u&gt; for readers like you. Write brilliantly with &lt;u&gt;&lt;a href="https://writewithspiral.com/" rel="noopener noreferrer" target="_blank"&gt;Spiral&lt;/a&gt;&lt;/u&gt;. Organize files automatically with &lt;u&gt;&lt;a href="https://makeitsparkle.co/?utm_source=everyfooter" rel="noopener noreferrer" target="_blank"&gt;Sparkle&lt;/a&gt;&lt;/u&gt;. Deliver yourself from email with &lt;u&gt;&lt;a href="https://cora.computer" rel="noopener noreferrer" target="_blank"&gt;Cora&lt;/a&gt;&lt;/u&gt;. Dictate effortlessly with &lt;u&gt;&lt;a href="https://monologue.to" rel="noopener noreferrer" target="_blank"&gt;Monologue&lt;/a&gt;&lt;/u&gt;. 
Work on documents with AI agents using &lt;u&gt;&lt;a href="https://www.proofeditor.ai/?source=post_button" rel="noopener noreferrer" target="_blank"&gt;Proof&lt;/a&gt;&lt;/u&gt;.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;For sponsorship opportunities, reach out to sponsorships@every.to.&lt;/em&gt;&lt;/p&gt;&lt;div class="quill-button" data-source="{&amp;quot;id&amp;quot;:&amp;quot;quill-button-1781288785355&amp;quot;,&amp;quot;text&amp;quot;:&amp;quot;Upgrade to paid&amp;quot;,&amp;quot;url&amp;quot;:&amp;quot;https://every.to/subscribe?source=post_button&amp;quot;}" id="quill-button-1781288785355"&gt;&lt;a href="https://every.to/subscribe?source=post_button"&gt;Upgrade to paid&lt;/a&gt;&lt;/div&gt;</description>
      <author>Every Staff / Context Window</author>
      <pubDate>2026-06-14 08:00:00 -0400</pubDate>
      <guid>https://every.to/context-window/fable-disabled</guid>
      <link>https://every.to/context-window/fable-disabled</link>
    </item>
    <item>
      <title>The Moral of Fable</title>
      <description>&lt;table&gt;&lt;tr&gt;&lt;td&gt;&lt;img alt="Chain of Thought" src="https://d24ovhgu8s7341.cloudfront.net/uploads/publication/logo/59/small_chain_of_thought_logo.png" /&gt;&lt;/td&gt;&lt;td&gt;&lt;/td&gt;&lt;td&gt;&lt;table&gt;&lt;tr&gt;&lt;td&gt;by &lt;a href="https://every.to/@danshipper" itemprop="name"&gt;Dan Shipper&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;in &lt;a href="https://every.to/chain-of-thought"&gt;Chain of Thought&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;figure&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/post/cover/4300/full_page_cover_bd6a75920f540cbc-lush_aesop_cover.png"&gt;&lt;figcaption&gt;Midjourney/Every illustration.&lt;/figcaption&gt;&lt;/figure&gt;&lt;p&gt;&lt;em&gt;Was this newsletter forwarded to you? &lt;u&gt;&lt;a href="https://every.to/account" rel="noopener noreferrer" target="_blank"&gt;Sign up&lt;/a&gt;&lt;/u&gt; to get it in your inbox.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;As any child who’s heard &lt;strong&gt;Aesop&lt;/strong&gt; knows, the point of a fable is its moral. We see the consequences of falsely crying wolf. We learn why slow and steady wins the hare race.&lt;/p&gt;&lt;p&gt;So what is the moral of Claude Fable 5, Anthropic’s newest model, which this week we called &lt;u&gt;&lt;a href="https://every.to/vibe-check/anthropic-mythos-our-fable-vibe-check" rel="noopener noreferrer" target="_blank"&gt;the best coding model in the world&lt;/a&gt;&lt;/u&gt;?&lt;/p&gt;&lt;p&gt;For engineers, the case is easy to make. For many knowledge workers, though, Fable might feel incremental. You may have one-shotted an impressive demo or two, but you’re probably not using it for your day-to-day work. Why would you? It costs twice as much and the results aren’t that much better.&lt;/p&gt;&lt;p&gt;But there is a certain class of developers who are feeling Fable’s full force. 
These are people like &lt;strong&gt;&lt;u&gt;&lt;a href="https://cora.computer" rel="noopener noreferrer" target="_blank"&gt;Cora&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; general manager &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@kieran_1355" rel="noopener noreferrer" target="_blank"&gt;Kieran Klaassen&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;, who is suddenly churning through his backlog of bug fixes and feature requests in hours instead of days. “This is my favorite model ever,” he told me.&lt;/p&gt;&lt;p&gt;What’s the difference between Kieran and everyone else? It isn’t simply that he’s a developer but that he’s at Level 7 or 8 on &lt;u&gt;&lt;a href="https://every.to/guides/the-eight-levels-of-ai-adoption" rel="noopener noreferrer" target="_blank"&gt;our scale of AI use&lt;/a&gt;&lt;/u&gt;: He delegates whole projects, lets agents work asynchronously, reviews the results, and feeds what he learns into the next run. In other words, he writes—dare I say it—&lt;u&gt;&lt;a href="https://x.com/steipete/status/2063697162748260627" rel="noopener noreferrer" target="_blank"&gt;loops&lt;/a&gt;&lt;/u&gt;, not prompts.&lt;/p&gt;&lt;p&gt;For now, this might make Fable seem like a tool for developers. But in AI, developer workflows &lt;u&gt;&lt;a href="https://every.to/context-window/one-app-to-rule-all-knowledge-work" rel="noopener noreferrer" target="_blank"&gt;have a habit&lt;/a&gt;&lt;/u&gt; of spreading to the rest of knowledge work. 
Claude Code started as a developer tool, and now the same methodology is being used for everything from &lt;u&gt;&lt;a href="https://every.to/also-true-for-humans/you-are-the-most-expensive-model" rel="noopener noreferrer" target="_blank"&gt;slide decks to spreadsheets&lt;/a&gt;&lt;/u&gt; inside of Cowork and Codex.&lt;/p&gt;&lt;p&gt;If you’re not feeling Fable’s force, that’s probably because you haven’t yet started to treat your work like gardening.&lt;/p&gt;&lt;h2&gt;Gardening and loops&lt;/h2&gt;&lt;p&gt;Two years ago I &lt;u&gt;&lt;a href="https://every.to/chain-of-thought/capability-blindness-and-the-future-of-creativity" rel="noopener noreferrer" target="_blank"&gt;wrote&lt;/a&gt;&lt;/u&gt;:&lt;/p&gt;&lt;blockquote&gt;“This era of creativity is going to look more like gardening. A gardener doesn’t grow plants directly. Instead, she sets up the conditions for the garden to grow.”&lt;/blockquote&gt;&lt;p&gt;When you garden, you’re creating a loop. You water and weed, prune and stake, establishing and maintaining the optimal environment; the plant itself does the growing.&lt;/p&gt;&lt;p&gt;Likewise, developers are now creating the optimal conditions for a product to grow and improve. They’re collecting inputs like user feedback and &lt;u&gt;&lt;a href="https://every.to/context-window/how-to-get-the-most-out-of-fable-5" rel="noopener noreferrer" target="_blank"&gt;turning them into actionable plans&lt;/a&gt;&lt;/u&gt;, which then become the basis for AI to do its work. They also create a system for reviewing that work, merging the changes, and &lt;u&gt;&lt;a href="https://every.to/source-code/compound-engineering-the-definitive-guide" rel="noopener noreferrer" target="_blank"&gt;integrating the lessons.&lt;/a&gt;&lt;/u&gt; Each stage’s output becomes the next one’s input; the developer’s role becomes determining the best order and process to help the product flourish. The process becomes the product.&lt;/p&gt;&lt;p&gt;You can already measure this. 
As soon as he’d grasped Fable’s capabilities, Kieran set a new goal for Cora: Fix any reported papercut or bug within 24 hours. So far, Fable’s held up its end of the bargain: The median time from a bug report to a merged fix is five hours over the last week, per our internal measurements. A product built this way is like a wind-up toy in reverse: Instead of winding down after you release it, every turn of the loop gives it more energy and momentum.&lt;/p&gt;&lt;p&gt;Tending the work is the loop. You decide what goes in, what it can reach, and when it’s done. Swap fixing bugs for making copy edits or writing quarterly sales forecasts, and the same shape holds for any knowledge work.&lt;/p&gt;&lt;h2&gt;The end of the flat frontier&lt;/h2&gt;&lt;p&gt;Working in the way Fable demands opens up completely new realms of achievement for individuals and small teams. It is another step toward &lt;u&gt;&lt;a href="https://every.to/chain-of-thought/ai-and-the-age-of-the-individual" rel="noopener noreferrer" target="_blank"&gt;the world I described in 2022&lt;/a&gt;&lt;/u&gt;, in which “the consequence of models making existing work cheaper is that individuals can undertake projects that previously belonged only to organizations.”&lt;/p&gt;&lt;p&gt;While Every’s products are each run primarily &lt;u&gt;&lt;a href="https://every.to/chain-of-thought/the-two-slice-team" rel="noopener noreferrer" target="_blank"&gt;by a single engineer&lt;/a&gt;&lt;/u&gt;, Fable changes the size of the swings each one can take. Cora is the cleanest example. Over the last two months Kieran has built an entire email inbox from scratch on the web and iOS. The nearest competitor, Superhuman, &lt;u&gt;&lt;a href="https://www.lennysnewsletter.com/p/superhumans-secret-to-success-rahul-vohra" rel="noopener noreferrer" target="_blank"&gt;spent more than two years&lt;/a&gt;&lt;/u&gt; building its inbox before it even made mobile a top priority. 
AI made this possible even before Fable came out, but Kieran’s process &lt;u&gt;&lt;a href="https://every.to/context-window/how-to-get-the-most-out-of-fable-5#example-3-turn-feedback-into-batched-changes" rel="noopener noreferrer" target="_blank"&gt;has dramatically accelerated&lt;/a&gt;&lt;/u&gt; now that it’s available.&lt;/p&gt;&lt;p&gt;But the same model that expands what one person can do also raises the question of which people get to do it.&lt;/p&gt;&lt;p&gt;For the last several decades, there wasn’t a huge difference between cutting-edge technology and what was widely available to the public. A billionaire and the average person used, more or less, the same MacBook. Now the top end is vastly different, far more expensive, and partly unavailable.&lt;/p&gt;&lt;p&gt;Not only is Fable twice as expensive per token as &lt;u&gt;&lt;a href="https://every.to/vibe-check/opus-4-8-vibecheck" rel="noopener noreferrer" target="_blank"&gt;Opus 4.8&lt;/a&gt;&lt;/u&gt;, it’s also incredibly &lt;em&gt;token hungry&lt;/em&gt;—it’s not uncommon for it to automatically spin up dozens of subagents to check its own work. This means that using Fable for real work requires capital. We’ve seen an asymmetry like this before, but not in technology. It usually shows up in the labor market: The top engineer in the world might command a salary orders of magnitude higher than even a good engineer’s.&lt;/p&gt;&lt;p&gt;Each of Aesop’s fables ends with a pithy sentence that states, outright, the moral of the story. This one doesn’t close as cleanly. 
Developer workflows are spreading to all knowledge work, but there’s a definite lag between the two; there’s new power in the hands of individuals, and potentially new distance between those hands and ones that can afford to run the frontier continuously.&lt;/p&gt;&lt;p&gt;Maybe it’s more accurate to say that the moral for Anthropic’s Fable is still being written.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;em&gt;&lt;a href="https://every.to/@danshipper" rel="noopener noreferrer" target="_blank"&gt;Dan Shipper&lt;/a&gt;&lt;/em&gt;&lt;/strong&gt; &lt;em&gt;is the cofounder and CEO of Every, where he writes the&lt;/em&gt; &lt;em&gt;&lt;a href="https://every.to/chain-of-thought" rel="noopener noreferrer" target="_blank"&gt;Chain of Thought&lt;/a&gt;&lt;/em&gt; &lt;em&gt;column and hosts the podcast&lt;/em&gt; &lt;a href="https://open.spotify.com/show/5qX1nRTaFsfWdmdj5JWO1G" rel="noopener noreferrer" target="_blank"&gt;AI &amp;amp; I&lt;/a&gt;. &lt;em&gt;You can follow him on X at&lt;/em&gt; &lt;em&gt;&lt;a href="https://twitter.com/danshipper" rel="noopener noreferrer" target="_blank"&gt;@danshipper&lt;/a&gt;&lt;/em&gt; &lt;em&gt;and on&lt;/em&gt; &lt;em&gt;&lt;a href="https://www.linkedin.com/in/danshipper/" rel="noopener noreferrer" target="_blank"&gt;LinkedIn&lt;/a&gt;. 
&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;To read more essays like this, subscribe to &lt;u&gt;&lt;a href="https://every.to/subscribe" rel="noopener noreferrer" target="_blank"&gt;Every&lt;/a&gt;&lt;/u&gt;, and follow us on X at &lt;u&gt;&lt;a href="http://twitter.com/every" rel="noopener noreferrer" target="_blank"&gt;@every&lt;/a&gt;&lt;/u&gt; and on &lt;u&gt;&lt;a href="https://www.linkedin.com/company/everyinc/" rel="noopener noreferrer" target="_blank"&gt;LinkedIn&lt;/a&gt;&lt;/u&gt;.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;We &lt;u&gt;&lt;a href="https://every.to/studio" rel="noopener noreferrer" target="_blank"&gt;build AI tools&lt;/a&gt;&lt;/u&gt; for readers like you. Write brilliantly with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://writewithspiral.com/" rel="noopener noreferrer" target="_blank"&gt;Spiral&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. Organize files automatically with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://makeitsparkle.co/?utm_source=everyfooter" rel="noopener noreferrer" target="_blank"&gt;Sparkle&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. Deliver yourself from email with &lt;u&gt;&lt;a href="https://cora.computer/" rel="noopener noreferrer" target="_blank"&gt;Cora&lt;/a&gt;&lt;/u&gt;. Dictate effortlessly with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://monologue.to/" rel="noopener noreferrer" target="_blank"&gt;Monologue&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. Collaborate with agents on documents with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;a href="https://www.proofeditor.ai/" rel="noopener noreferrer" target="_blank"&gt;Proof&lt;/a&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. 
&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;For sponsorship opportunities, reach out to sponsorships@every.to.&lt;/em&gt;&lt;/p&gt;&lt;div class="quill-button" data-source="{&amp;quot;id&amp;quot;:&amp;quot;quill-button-1769187301610&amp;quot;,&amp;quot;url&amp;quot;:&amp;quot;https://every.to/subscribe?source=post_button&amp;quot;,&amp;quot;text&amp;quot;:&amp;quot;Subscribe&amp;quot;}" id="quill-button-1769187301610"&gt;&lt;a href="https://every.to/subscribe?source=post_button"&gt;Subscribe&lt;/a&gt;&lt;/div&gt;</description>
      <author>Dan Shipper / Chain of Thought</author>
      <pubDate>2026-06-12 15:00:00 -0400</pubDate>
      <guid>https://every.to/chain-of-thought/the-moral-of-fable</guid>
      <link>https://every.to/chain-of-thought/the-moral-of-fable</link>
    </item>
    <item>
      <title>AI Everywhere, All at Once</title>
      <description>&lt;table&gt;&lt;tr&gt;&lt;td&gt;&lt;img alt="Context Window" src="https://d24ovhgu8s7341.cloudfront.net/uploads/publication/logo/94/small_context_windown_1.png" /&gt;&lt;/td&gt;&lt;td&gt;&lt;/td&gt;&lt;td&gt;&lt;table&gt;&lt;tr&gt;&lt;td&gt;by &lt;a href="https://every.to/@laura_27bbaf_1" itemprop="name"&gt;Laura Entis&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;in &lt;a href="https://every.to/context-window"&gt;Context Window&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;figure&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/post/cover/4298/full_page_cover_30fc047788374370-Thu_Cover_Image.png"&gt;&lt;figcaption&gt;Midjourney/Every illustration.&lt;/figcaption&gt;&lt;/figure&gt;&lt;p&gt;Anthropic’s Mythos-level Fable 5 is &lt;u&gt;&lt;a href="https://every.to/vibe-check/anthropic-mythos-our-fable-vibe-check" rel="noopener noreferrer" target="_blank"&gt;here&lt;/a&gt;&lt;/u&gt;, which means we’re experimenting with how to get the most out of the super-capable, token-hungry model. Today, four Every team members share their approaches, plus we package eight Fable workflows into &lt;u&gt;&lt;a href="https://every.to/p/claude-fable-5-prompt-library" rel="noopener noreferrer" target="_blank"&gt;prompts you can test out for yourself&lt;/a&gt;&lt;/u&gt;. 
Elsewhere, &lt;strong&gt;&lt;u&gt;&lt;a href="https://www.monologue.to/" rel="noopener noreferrer" target="_blank"&gt;Monologue&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; general manager &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@naveen_6804" rel="noopener noreferrer" target="_blank"&gt;Naveen Naidu&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; reports from the ground at Apple’s developer conference on why Siri is—wait for it—finally good, and head of platform &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@williewilliams" rel="noopener noreferrer" target="_blank"&gt;Willie Williams&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; argues the one thing even the most powerful LLMs can’t do is vibe.&lt;/p&gt;&lt;p&gt;&lt;em&gt;Was this newsletter forwarded to you? &lt;u&gt;&lt;a href="https://every.to/account" rel="noopener noreferrer" target="_blank"&gt;Sign up&lt;/a&gt;&lt;/u&gt; to get it in your inbox.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;Inside Every&lt;/strong&gt;&lt;/h3&gt;&lt;h4&gt;&lt;strong&gt;Fable 5 versus everything else&lt;/strong&gt;&lt;/h4&gt;&lt;p&gt;Anytime there’s a major new model release, there’s pressure to reconsider your AI setup. Or, if you’ve just come out of a meditation retreat, maybe your entire &lt;u&gt;&lt;a href="https://x.com/danshipper/status/2011791802550923579" rel="noopener noreferrer" target="_blank"&gt;life&lt;/a&gt;&lt;/u&gt;.  &lt;/p&gt;&lt;p&gt;Should you swap out your preferred model for the newest arrival? Is the new model sufficiently better to make the switch if you don’t like the &lt;u&gt;&lt;a href="https://every.to/chain-of-thought/inside-anthropic-s-2026-developer-conference" rel="noopener noreferrer" target="_blank"&gt;harness&lt;/a&gt;&lt;/u&gt;? &lt;/p&gt;&lt;p&gt;Fable 5 has thrown the Every team into a new round of existential questioning. 
It’s an obvious first choice for certain projects—those that are large, complex, and delegable—and an arguably worse, too-expensive fit for others.&lt;/p&gt;&lt;p&gt;After a week of testing the model, most of us at Every have settled into a two-prong approach: Fire up Fable for ambitious assignments, let it do its thing, and reach for your favored coding agent for smaller-scale, iterative tasks. &lt;/p&gt;&lt;p&gt;&lt;strong&gt;Head of growth &lt;u&gt;&lt;a href="https://every.to/@tedescau" rel="noopener noreferrer" target="_blank"&gt;Austin Tedesco&lt;/a&gt;&lt;/u&gt;’s breakdown: &lt;/strong&gt;Fable 5 demands “a very different way of approaching knowledge work,” one that requires fine-tuning exactly what outcomes you want from the model, what information it needs to execute, and trusting it enough to sit back and let it cook. &lt;/p&gt;&lt;p&gt;So far, Austin’s reserved Fable 5 for “rocket launcher” projects that can run for four-plus hours, like building an NBA front office simulation game, or researching and executing growth experiments overnight. With the model, he typically uses &lt;u&gt;&lt;a href="https://every.to/guides/compound-engineering" rel="noopener noreferrer" target="_blank"&gt;compound engineering’s LFG flow&lt;/a&gt;&lt;/u&gt;, which has the agent brainstorm, plan, work, review, and repeat.&lt;/p&gt;&lt;p&gt;The Codex app remains his daily driver. Austin has a setup where, when a meeting ends, Codex retrieves the action items, decides whether it can handle any of them on its own, and, if so, starts a new thread to do the work. 
He also uses Codex with the &lt;strong&gt;&lt;u&gt;&lt;a href="https://writewithspiral.com/" rel="noopener noreferrer" target="_blank"&gt;Spiral&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; MCP for drafting Every’s social copy, internal strategy documents, and most same-day tasks.&lt;/p&gt;&lt;div class="quill-block-image" id="quill-block-image-1781190018683-04qej4n0i" data-source="{&amp;quot;dom_id&amp;quot;:&amp;quot;quill-block-image-1781190018683-04qej4n0i&amp;quot;,&amp;quot;link&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4298/optimized_61d0e3dc-a03d-4d3d-831b-70cd687d7198.png&amp;quot;,&amp;quot;image&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4298/optimized_61d0e3dc-a03d-4d3d-831b-70cd687d7198.png&amp;quot;,&amp;quot;caption&amp;quot;:&amp;quot;Austin’s current setup. (Image courtesy of Austin Tedesco.)&amp;quot;,&amp;quot;error&amp;quot;:null}"&gt;&lt;div&gt;&lt;a href="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4298/optimized_61d0e3dc-a03d-4d3d-831b-70cd687d7198.png" target="_blank" rel="noopener noreferrer"&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4298/optimized_61d0e3dc-a03d-4d3d-831b-70cd687d7198.png" alt="Austin’s current setup. (Image courtesy of Austin Tedesco.)"&gt;&lt;/a&gt;&lt;figcaption class="quill-image-caption"&gt;Austin’s current setup. 
(Image courtesy of Austin Tedesco.)&lt;/figcaption&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;&lt;br&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;a href="https://cora.computer" rel="noopener noreferrer" target="_blank"&gt;Cora&lt;/a&gt; general manager &lt;u&gt;&lt;a href="https://every.to/@kieran_1355" rel="noopener noreferrer" target="_blank"&gt;Kieran Klaassen&lt;/a&gt;&lt;/u&gt;’s breakdown: &lt;/strong&gt;The way Kieran likes to work—an &lt;u&gt;&lt;a href="https://every.to/context-window/you-re-the-bread-in-the-ai-sandwich" rel="noopener noreferrer" target="_blank"&gt;“AI sandwich”&lt;/a&gt;&lt;/u&gt; in which he sets the task, the machine executes, and he reviews the results—is the ideal setup for Fable 5. His process hasn’t changed, but Fable 5’s &lt;u&gt;&lt;a href="https://every.to/vibe-check/anthropic-mythos-our-fable-vibe-check" rel="noopener noreferrer" target="_blank"&gt;superior abilities&lt;/a&gt;&lt;/u&gt; on complex, multi-step assignments mean the setup works a lot better than it used to.&lt;/p&gt;&lt;p&gt;Fable 5 has become Kieran’s default for the middle of the sandwich. For the “bread” stages, he usually works in &lt;u&gt;&lt;a href="https://every.to/vibe-check/cursor" rel="noopener noreferrer" target="_blank"&gt;Cursor&lt;/a&gt;&lt;/u&gt;, where he brainstorms and polishes. 
And for smaller independent tasks he can assign to an agent and review later, he uses &lt;u&gt;&lt;a href="https://every.to/p/how-to-use-codex-for-knowledge-work-a-power-user-s-guide" rel="noopener noreferrer" target="_blank"&gt;Codex&lt;/a&gt;&lt;/u&gt; CLI, &lt;u&gt;&lt;a href="https://every.to/source-code/claude-code-for-product-managers" rel="noopener noreferrer" target="_blank"&gt;Claude Code&lt;/a&gt;&lt;/u&gt; CLI, or Cursor managed agents.&lt;/p&gt;&lt;div class="quill-block-image" id="quill-block-image-1781190076296-xsght7hgo" data-source="{&amp;quot;dom_id&amp;quot;:&amp;quot;quill-block-image-1781190076296-xsght7hgo&amp;quot;,&amp;quot;link&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4298/optimized_204ac51f-a922-45c5-b31c-aee49cb513f3.png&amp;quot;,&amp;quot;image&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4298/optimized_204ac51f-a922-45c5-b31c-aee49cb513f3.png&amp;quot;,&amp;quot;caption&amp;quot;:&amp;quot;What made the Kieran cut. (Image courtesy of Kieran Klaassen.)&amp;quot;,&amp;quot;error&amp;quot;:null}"&gt;&lt;div&gt;&lt;a href="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4298/optimized_204ac51f-a922-45c5-b31c-aee49cb513f3.png" target="_blank" rel="noopener noreferrer"&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4298/optimized_204ac51f-a922-45c5-b31c-aee49cb513f3.png" alt="What made the Kieran cut. (Image courtesy of Kieran Klaassen.)"&gt;&lt;/a&gt;&lt;figcaption class="quill-image-caption"&gt;What made the Kieran cut. (Image courtesy of Kieran Klaassen.)&lt;/figcaption&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;&lt;br&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Head of platform &lt;u&gt;&lt;a href="https://every.to/@williewilliams" rel="noopener noreferrer" target="_blank"&gt;Willie Williams&lt;/a&gt;&lt;/u&gt;’s breakdown: &lt;/strong&gt;Willie is still working out his setup. 
Fable crushes other models on Every’s &lt;u&gt;&lt;a href="https://every.to/benchmarks/senior-engineer-benchmark" rel="noopener noreferrer" target="_blank"&gt;Senior Engineer benchmark&lt;/a&gt;&lt;/u&gt;, but it’s too slow and token-hungry to be a good collaborator. “Do I take the downside of a slightly less capable model, knowing that when we go to the iteration portion of the relationship, it’s more enjoyable to iterate with?”&lt;/p&gt;&lt;p&gt;For now, the Codex app is still where he does most of his daily work. He has spent a lot of time building his setup inside the app: “I can have one thread talk to another thread that talks to another thread—it makes for a nice workflow where I always know what’s going on.”&lt;/p&gt;&lt;p&gt;He plans to test Fable 5’s limits with tasks he’d give a senior engineer, such as reviewing a full codebase alongside a long list of product tickets and looking for an elegant fix that could solve several complaints at once.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Head of tech consulting &lt;u&gt;&lt;a href="https://every.to/@mike_2114" rel="noopener noreferrer" target="_blank"&gt;Mike Taylor&lt;/a&gt;&lt;/u&gt;’s breakdown: &lt;/strong&gt;The second there is a superior model, Mike reorganizes his workflow around it. Mike plans to put Fable 5 through its paces with tasks built around ambitious loops, such as having it write a technical book section by section from a table of contents, checking each section against editorial guidelines before continuing. “I will still use Codex, but mostly out of obligation that I should try all the different things,” he says. “If I weren’t working at a company where we need to have an opinion on these things, and thus need to try everything, I would probably just be using Fable.” (An AI early adopter, Mike is happy to shell out for access to the best new models—he already pays for his own Claude Max plan for personal projects.) 
&lt;/p&gt;&lt;p&gt;One giant caveat: Mike discovered, and alerted the rest of the consulting team, that Fable cannot be used for work done on behalf of the team’s clients. Consulting work often includes confidential information, and Fable’s model environment may retain context beyond a specific task, violating existing NDAs.&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;Signal&lt;/strong&gt;&lt;/h3&gt;&lt;h4&gt;&lt;strong&gt;An Apple AI comeback?&lt;/strong&gt;&lt;/h4&gt;&lt;p&gt;For years, Apple has been an AI punching bag. Some of its most anticipated AI features never materialized, and Siri…sigh. &lt;/p&gt;&lt;p&gt;On the ground at this year’s Worldwide Developers Conference (WWDC), however, the vibes were looking up. After Apple partnered with Google to build a new model family, Siri is—wait for it—finally good, according to &lt;u&gt;&lt;a href="https://www.monologue.to/" rel="noopener noreferrer" target="_blank"&gt;Monologue&lt;/a&gt;&lt;/u&gt; general manager Naveen Naidu, who was on-site to test the beta version. He should know: Naveen is the one-man force behind Every’s voice dictation app, which runs on its own fine-tuned model. &lt;/p&gt;&lt;p&gt;&lt;strong&gt;What happened: &lt;/strong&gt;Apple showed off an improved on-device model that can handle simple tasks like setting an alarm. It impressed Naveen after 10 minutes of testing. More complex requests, like booking a flight, summarizing a long Slack thread, or searching through a user’s transcripts, can be routed through Private Cloud Compute, Apple’s privacy-focused cloud system for running larger AI models. 
The company is giving developers with fewer than 2 million downloads on their iOS app &lt;u&gt;&lt;a href="https://www.apple.com/newsroom/2026/06/apple-aids-app-development-with-new-intelligence-frameworks-and-advanced-tools/" rel="noopener noreferrer" target="_blank"&gt;free access&lt;/a&gt;&lt;/u&gt; to these models through its developer toolkit.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Why it matters: &lt;/strong&gt;Free access changes the math for most developers. If an AI feature runs on the user’s iPhone or Mac, the app maker doesn’t have to pay OpenAI, Anthropic, or Google every time someone uses it. That could make AI features possible for apps that haven’t been able to justify a monthly model bill. “People can start creating great experiences without worrying about costs,” Naveen says.&lt;/p&gt;&lt;p&gt;He’s excited to test it out for himself. Monologue’s monthly token fees would shrink if he moved some features to Apple models. “Obviously I need to test whether it’s fast enough and fits my constraints,” he says. “But if it’s free, I’m going to try it and see if I can build new experiences with it.”&lt;/p&gt;&lt;p&gt;Apple’s courting of developers appears to be working. Naveen talked to attendees who had been going to the conference for decades. The consensus? “Apple feels much more reachable,” he says.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;Tool spotlight&lt;/strong&gt;&lt;/h3&gt;&lt;h4&gt;&lt;strong&gt;Unicorn Studio&lt;/strong&gt;&lt;/h4&gt;&lt;p&gt;If you think Every’s site looks cool—I’m biased, but it 100 percent does—a big reason for that is senior designer &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@daniel_5fbd21_1" rel="noopener noreferrer" target="_blank"&gt;Daniel Rodrigues&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;, the man behind the custom graphics that make Every pieces feel like experiences rather than static articles. 
&lt;/p&gt;&lt;p&gt;To make the animated hero images and interactive backgrounds that accompany each &lt;u&gt;&lt;a href="https://every.to/vibe-check" rel="noopener noreferrer" target="_blank"&gt;Vibe Check&lt;/a&gt;&lt;/u&gt; or ambitious features like Dan’s 8,000-plus-word &lt;u&gt;&lt;a href="https://every.to/p/after-automation" rel="noopener noreferrer" target="_blank"&gt;essay&lt;/a&gt;&lt;/u&gt; on why automation is a myth, Daniel turns to Unicorn Studio.&lt;/p&gt;&lt;p&gt;The &lt;u&gt;&lt;a href="https://www.unicorn.studio/" rel="noopener noreferrer" target="_blank"&gt;WebGL tool&lt;/a&gt;&lt;/u&gt; lets designers build animated, 3D-esque web graphics without having to write any code. For Every’s most recent Vibe Check on &lt;u&gt;&lt;a href="https://every.to/vibe-check/anthropic-mythos-our-fable-vibe-check" rel="noopener noreferrer" target="_blank"&gt;Anthropic’s Mythos-level model&lt;/a&gt;&lt;/u&gt;, Daniel’s directive was to create a “nebula type of vibe, like space.” &lt;/p&gt;&lt;p&gt;Unicorn Studio made it easy for him to nail the assignment: &lt;/p&gt;&lt;div class="quill-block-image" id="quill-block-image-1781192139106" data-source="{&amp;quot;dom_id&amp;quot;:&amp;quot;quill-block-image-1781192139106&amp;quot;,&amp;quot;link&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4298/optimized_1d8e7593-15b2-404e-b65c-3205bdf774d2.gif&amp;quot;,&amp;quot;image&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4298/optimized_1d8e7593-15b2-404e-b65c-3205bdf774d2.gif&amp;quot;,&amp;quot;caption&amp;quot;:&amp;quot;An image that Daniel made for the Fable 5 launch. 
(Images courtesy of Daniel Rodrigues.)&amp;quot;,&amp;quot;error&amp;quot;:null}"&gt;&lt;div&gt;&lt;a href="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4298/optimized_1d8e7593-15b2-404e-b65c-3205bdf774d2.gif" target="_blank" rel="noopener noreferrer"&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4298/optimized_1d8e7593-15b2-404e-b65c-3205bdf774d2.gif" alt="An image that Daniel made for the Fable 5 launch. (Images courtesy of Daniel Rodrigues.)"&gt;&lt;/a&gt;&lt;figcaption class="quill-image-caption"&gt;An image that Daniel made for the Fable 5 launch. (Images courtesy of Daniel Rodrigues.)&lt;/figcaption&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;&lt;br&gt;&lt;/p&gt;&lt;p&gt;Here are a few of his other recent creations using the tool:&lt;/p&gt;&lt;div class="quill-block-image" id="quill-block-image-1781194995290" data-source="{&amp;quot;dom_id&amp;quot;:&amp;quot;quill-block-image-1781194995290&amp;quot;,&amp;quot;link&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4298/optimized_34c1a413-0eda-4874-acf4-3fc14665c4ee.gif&amp;quot;,&amp;quot;image&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4298/optimized_34c1a413-0eda-4874-acf4-3fc14665c4ee.gif&amp;quot;,&amp;quot;caption&amp;quot;:&amp;quot;A cheese-y image Daniel made for the Opus 4.8 launch.&amp;quot;,&amp;quot;error&amp;quot;:null}"&gt;&lt;div&gt;&lt;a href="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4298/optimized_34c1a413-0eda-4874-acf4-3fc14665c4ee.gif" target="_blank" rel="noopener noreferrer"&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4298/optimized_34c1a413-0eda-4874-acf4-3fc14665c4ee.gif" alt="A cheese-y image Daniel made for the Opus 4.8 launch."&gt;&lt;/a&gt;&lt;figcaption class="quill-image-caption"&gt;A cheese-y image Daniel made for the Opus 4.8 launch.&lt;/figcaption&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class="quill-block-image" 
id="quill-block-image-1781195024385" data-source="{&amp;quot;dom_id&amp;quot;:&amp;quot;quill-block-image-1781195024385&amp;quot;,&amp;quot;link&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4298/optimized_e85ed590-1503-4b3a-ad23-814221c47249.gif&amp;quot;,&amp;quot;image&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4298/optimized_e85ed590-1503-4b3a-ad23-814221c47249.gif&amp;quot;,&amp;quot;caption&amp;quot;:&amp;quot;An image Daniel made for an article comparing Claude and Codex.&amp;quot;,&amp;quot;error&amp;quot;:null}"&gt;&lt;div&gt;&lt;a href="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4298/optimized_e85ed590-1503-4b3a-ad23-814221c47249.gif" target="_blank" rel="noopener noreferrer"&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4298/optimized_e85ed590-1503-4b3a-ad23-814221c47249.gif" alt="An image Daniel made for an article comparing Claude and Codex."&gt;&lt;/a&gt;&lt;figcaption class="quill-image-caption"&gt;An image Daniel made for an article comparing Claude and Codex.&lt;/figcaption&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;Jagged frontier&lt;/strong&gt;&lt;/h3&gt;&lt;h4&gt;&lt;strong&gt;LLMs supply options, I supply vibes&lt;/strong&gt;&lt;/h4&gt;&lt;p&gt;I asked &lt;u&gt;&lt;a href="https://writewithspiral.com/" rel="noopener noreferrer" target="_blank"&gt;Spiral&lt;/a&gt;&lt;/u&gt;, our AI writing assistant, for 20 opening lines to an article. Number 16 was the one. I couldn’t tell you why. I just knew.&lt;/p&gt;&lt;p&gt;There’s no doubt that AI is capable of generating genius. It can build towers of code and write whole drafts before I’ve finished my coffee. There’s no limit to a model’s ability to execute, but it can’t tell whether any of it is any good.&lt;/p&gt;&lt;p&gt;And that ability to judge is vitally important. 
Truly creative work is an act of jumping from point A to point D, with no explanation except that it will vibe with other humans. &lt;/p&gt;&lt;p&gt;What are vibes? Vibes are our ability to lock in and resonate with another person’s energy; they let us intuit what matters right now, at this exact cultural moment, drawn from a lifetime of being a person in the world, thinking as humans think. And because we can feel them, we can predict if other humans will feel them too.&lt;/p&gt;&lt;p&gt;LLMs don’t think the way we think; they can’t respond to our energy. Can’t resonate, can’t vibe. And without the ability to vibe, they are blind, capable of producing great outputs along with mediocre ones but incapable of recognizing the difference.&lt;/p&gt;&lt;p&gt;So today my arrangement with AI is simple: It supplies options, I supply vibes. We work together. But while it can mine the training set for solutions, the vibes aren’t in there—they’re in me.—&lt;em&gt;&lt;u&gt;&lt;a href="https://every.to/@williewilliams" rel="noopener noreferrer" target="_blank"&gt;Willie Williams&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;strong&gt;&lt;em&gt;&lt;a href="https://every.to/@laura_27bbaf_1" rel="noopener noreferrer" target="_blank"&gt;Laura Entis&lt;/a&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt; is a staff writer at Every. You can follow her on &lt;a href="https://www.linkedin.com/in/lauraentis/" rel="noopener noreferrer" target="_blank"&gt;LinkedIn&lt;/a&gt;. 
To read more essays like this, subscribe to &lt;u&gt;&lt;a href="https://every.to/subscribe" rel="noopener noreferrer" target="_blank"&gt;Every&lt;/a&gt;&lt;/u&gt;, and follow us on X at &lt;u&gt;&lt;a href="http://twitter.com/every" rel="noopener noreferrer" target="_blank"&gt;@every&lt;/a&gt;&lt;/u&gt; and on &lt;u&gt;&lt;a href="https://www.linkedin.com/company/everyinc/" rel="noopener noreferrer" target="_blank"&gt;LinkedIn&lt;/a&gt;&lt;/u&gt;.&lt;/em&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;For sponsorship opportunities, reach out to sponsorships@every.to.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;Help us scale the only subscription you need to stay at the edge of AI. Explore &lt;u&gt;&lt;a href="https://www.notion.so/Jobs-Every-25cca4f355ac80c5ad6ee7a6e93d6b4e?pvs=21" rel="noopener noreferrer" target="_blank"&gt;open roles at Every&lt;/a&gt;&lt;/u&gt;.&lt;/em&gt;&lt;/p&gt;&lt;div class="quill-button" data-source="{&amp;quot;id&amp;quot;:&amp;quot;quill-button-1769187301610&amp;quot;,&amp;quot;url&amp;quot;:&amp;quot;https://every.to/subscribe?source=post_button&amp;quot;,&amp;quot;text&amp;quot;:&amp;quot;Subscribe&amp;quot;}" id="quill-button-1769187301610"&gt;&lt;a href="https://every.to/subscribe?source=post_button"&gt;Subscribe&lt;/a&gt;&lt;/div&gt;</description>
      <author>Laura Entis / Context Window</author>
      <pubDate>2026-06-11 10:00:00 -0400</pubDate>
      <guid>https://every.to/context-window/ai-everywhere-all-at-once</guid>
      <link>https://every.to/context-window/ai-everywhere-all-at-once</link>
    </item>
    <item>
      <title>How to Get the Most Out of Fable 5</title>
      <description>&lt;table&gt;&lt;tr&gt;&lt;td&gt;&lt;img alt="Context Window" src="https://d24ovhgu8s7341.cloudfront.net/uploads/publication/logo/94/small_context_windown_1.png" /&gt;&lt;/td&gt;&lt;td&gt;&lt;/td&gt;&lt;td&gt;&lt;table&gt;&lt;tr&gt;&lt;td&gt;by &lt;a href="https://every.to/@laura_27bbaf_1" itemprop="name"&gt;Laura Entis&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;in &lt;a href="https://every.to/context-window"&gt;Context Window&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;figure&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/post/cover/4294/full_page_cover_0d11d15b4bbe2ba2-circle_cover.png"&gt;&lt;figcaption&gt;Midjourney/Every illustration.&lt;/figcaption&gt;&lt;/figure&gt;&lt;p&gt;&lt;em&gt;We’re hosting &lt;u&gt;&lt;a href="http://every.to/events" rel="noopener noreferrer" target="_blank"&gt;two live camps&lt;/a&gt;&lt;/u&gt; for paid Every members to put the latest frontier tools to work: &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://every.to/events/fable-5-power-user-camp" rel="noopener noreferrer" target="_blank"&gt;Fable 5 Camp&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt; this Friday, June 12, followed by a rescheduled &lt;/em&gt;&lt;strong&gt;&lt;em&gt;Codex for Power Users Camp&lt;/em&gt;&lt;/strong&gt;&lt;em&gt; on Friday, June 26. 
If you already registered for this Friday’s camp, your seat is saved for the Fable deep dive, and &lt;u&gt;&lt;a href="http://every.to/events" rel="noopener noreferrer" target="_blank"&gt;you can RSVP for the Codex Camp&lt;/a&gt;&lt;/u&gt;.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;&lt;strong&gt;‘AI &amp;amp; I’: &lt;/strong&gt;How Anthropic uses Claude Fable 5 with Mike Krieger&lt;/h2&gt;&lt;p&gt;Today, we’re releasing a new episode of our podcast &lt;em&gt;&lt;u&gt;&lt;a href="https://every.to/podcast" rel="noopener noreferrer" target="_blank"&gt;AI &amp;amp; I&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;. &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@danshipper" rel="noopener noreferrer" target="_blank"&gt;Dan Shipper&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; sits down with &lt;strong&gt;Mike Krieger&lt;/strong&gt;, the cofounder of Instagram and head of Anthropic Labs, to discuss what it feels like to build with &lt;u&gt;&lt;a href="https://every.to/vibe-check/anthropic-mythos-our-fable-vibe-check" rel="noopener noreferrer" target="_blank"&gt;Fable 5&lt;/a&gt;&lt;/u&gt;, a model powerful enough that it’s forcing him to rethink the very definition of productivity, engineering, and creative agency.&lt;/p&gt;&lt;p&gt;As someone who built one of the most popular consumer apps in the pre-GPT era and has had access to Fable 5 for months, Krieger has a rare vantage point on what the radical compression of the product development arc means for builders. 
&lt;/p&gt;&lt;p&gt;Watch on &lt;strong&gt;&lt;u&gt;&lt;a href="https://x.com/danshipper/status/2064761654789681281" rel="noopener noreferrer" target="_blank"&gt;X&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; or &lt;strong&gt;&lt;u&gt;&lt;a href="https://www.youtube.com/watch?v=XWpTgCvgYaE" rel="noopener noreferrer" target="_blank"&gt;YouTube&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;, or listen on &lt;strong&gt;&lt;u&gt;&lt;a href="https://open.spotify.com/episode/7s1VcIHp1q6PG9hofb2fVY?si=DsAlKVymRs2-J0cnM25M6w&amp;amp;nd=1&amp;amp;dlsi=4383c06a09314ba1" rel="noopener noreferrer" target="_blank"&gt;Spotify&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; or &lt;strong&gt;&lt;a href="https://podcasts.apple.com/us/podcast/how-anthropic-uses-claude-fable-5-with-mike-krieger/id1719789201?i=1000772067637" rel="noopener noreferrer" target="_blank"&gt;Apple Podcasts&lt;/a&gt;&lt;/strong&gt;. You can also read the &lt;strong&gt;&lt;a href="https://every.to/podcast/transcript-how-anthropic-uses-claude-fable-5-with-mike-krieger" rel="noopener noreferrer" target="_blank"&gt;transcript&lt;/a&gt;&lt;/strong&gt;.&lt;/p&gt;&lt;p&gt;Here are the highlights:&lt;/p&gt;&lt;ol&gt;&lt;li&gt;&lt;strong&gt;More work is happening overnight.&lt;/strong&gt; Fable 5 is the first model capable enough that you can hand it a complex task, walk away, and trust it will be completed by morning. When it hits an obstacle—a remote service goes down, say, or a tool stops working—it writes a workaround and forges ahead. 
That resilience has changed the daily rhythm of Krieger’s work: He now ends his workday by briefing the model on what needs to get done while he sleeps, rather than sitting down to do it himself.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;The gap between what’s in your head and what exists in the world is closing.&lt;/strong&gt; Given access to Fable 5 and a set of internal MCPs, an Anthropic recruiter described the experience as, “The first time in my life where I feel like the thing that’s in my head and the thing that exists in the world are right next to each other. I can just do it.” &lt;em&gt;This&lt;/em&gt; is the most meaningful thing about the new model class, Krieger says—it allows non-engineers to create the exact products they need to get more done.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Software engineering is dead. Long live software engineering.&lt;/strong&gt; Engineers now spend less time writing code and more time setting direction, reviewing what their AI agents have built, and making judgment calls when something breaks in production. The divide between product managers and engineers has blurred. “There is a feeling of loss, I think, in some of the better engineers that I talk to, as well as the feeling of, ‘Oh my God, but I can do insane amounts of work now at the same time.’ We’re holding both ideas in our heads at once,” Krieger says.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;All eyes are on verification.&lt;/strong&gt; If we can delegate more to the model, it becomes more important to check that what it has built works in practice. Krieger’s approach combines regression testing on known workflows, visual checks—including giving the model video captures of its own work so it can catch animation glitches screenshots would miss—and mock backends for anything too complex to test live. When a bug arrives via Slack, Fable 5 makes the fix, posts the pull request, then follows up hours later.&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;Miss an episode? 
Catch up on Dan’s recent conversations with LinkedIn cofounder &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/podcast/reid-hoffman-makes-five-predictions-about-ai-in-2026" rel="noopener noreferrer" target="_blank"&gt;Reid Hoffman&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;; the team that built Claude Code, &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/podcast/how-to-use-claude-code-like-the-people-who-built-it" rel="noopener noreferrer" target="_blank"&gt;Cat Wu&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; and &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/podcast/how-to-use-claude-code-like-the-people-who-built-it" rel="noopener noreferrer" target="_blank"&gt;Boris Cherny&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;; Vercel cofounder &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/podcast/vercel-s-guillermo-rauch-on-what-comes-after-coding" rel="noopener noreferrer" target="_blank"&gt;Guillermo Rauch&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;; podcaster &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/podcast/dwarkesh-patel-s-quest-to-learn-everything" rel="noopener noreferrer" target="_blank"&gt;Dwarkesh Patel&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;; and others, and learn how they use AI to think, create, and relate.&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;How the Every team is using Fable 5&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;The easiest way to be disappointed by Fable 5 is to use it as if it were &lt;u&gt;&lt;a href="https://every.to/vibe-check/gpt-5-5" rel="noopener noreferrer" target="_blank"&gt;GPT-5.5&lt;/a&gt;&lt;/u&gt; or &lt;u&gt;&lt;a href="https://every.to/context-window/opus-4-8-is-smart-enough-to-get-in-your-way" rel="noopener noreferrer" target="_blank"&gt;Opus 4.8&lt;/a&gt;&lt;/u&gt;, smart models that require specific instructions and careful prompting for the best results.&lt;/p&gt;&lt;p&gt;Instead, Fable 5 feels like working with a capable coworker—at least that’s Every’s consensus &lt;u&gt;&lt;a href="https://every.to/vibe-check/anthropic-mythos-our-fable-vibe-check" 
rel="noopener noreferrer" target="_blank"&gt;after a week of testing&lt;/a&gt;&lt;/u&gt;. &lt;/p&gt;&lt;p&gt;“It feels like you have an engineer on your team that you just gave a problem to, and they’ll figure it out,” says &lt;strong&gt;&lt;u&gt;&lt;a href="https://cora.computer/" rel="noopener noreferrer" target="_blank"&gt;Cora&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; general manager &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@kieran_1355" rel="noopener noreferrer" target="_blank"&gt;Kieran Klaassen&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;. &lt;/p&gt;&lt;p&gt;That means, to get the most out of Anthropic’s &lt;u&gt;&lt;a href="https://every.to/vibe-check/anthropic-mythos-our-fable-vibe-check" rel="noopener noreferrer" target="_blank"&gt;first Mythos-class model&lt;/a&gt;&lt;/u&gt; available to the public, you have to &lt;u&gt;&lt;a href="https://every.to/chain-of-thought/the-knowledge-economy-is-over-welcome-to-the-allocation-economy" rel="noopener noreferrer" target="_blank"&gt;think like a manager&lt;/a&gt;&lt;/u&gt;: Equip the model with context, goals, and a way to verify the work, then step aside. It may even stumble on a solution you hadn’t considered.&lt;/p&gt;&lt;p&gt;Not every task deserves this treatment. Smart colleagues don’t come cheap, and neither does Fable 5. Here’s how to get the most out of this powerful new model and some of the workflows the team is using. &lt;/p&gt;&lt;h4&gt;&lt;strong&gt;Pick the right tasks &lt;/strong&gt;&lt;/h4&gt;&lt;p&gt;Tasks that are good candidates for Fable 5 have four qualities: You’re able to give the model organized and deep context, a well-defined goal, and a clear definition of what good or done looks like, and the importance of the task justifies the cost.&lt;/p&gt;&lt;p&gt;The model is smart enough to reason its way through complex problems and likes to carry tasks through to the end, but if your data is wrong or out of date, or your goals conflict, it will likely reach the wrong conclusion. 
That’s less of a concern with earlier, less powerful models, where you give feedback more frequently during a task and can catch those mistakes. &lt;/p&gt;&lt;p&gt;Advanced users of AI—who operate at Level 7 or Level 8 on our &lt;u&gt;&lt;a href="https://every.to/guides/the-eight-levels-of-ai-adoption" rel="noopener noreferrer" target="_blank"&gt;AI adoption curve&lt;/a&gt;&lt;/u&gt;—are already comfortable delegating to their agents. For everyone else, using the model demands a mental reframing. Instead of iterating back and forth, you frontload the work: provide the right context, establish clear directives, let Fable 5 do its thing, and review the results only once it has completed the entire task. The examples below are entry points to get you started.&lt;/p&gt;&lt;h4&gt;&lt;strong&gt;Example 1: Fix a broken workflow&lt;/strong&gt;&lt;/h4&gt;&lt;p&gt;Senior engineer &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@nityesh" rel="noopener noreferrer" target="_blank"&gt;Nityesh Agarwal&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; built a Claude Code skill to help Every’s consulting team create first drafts of PowerPoint decks. It worked, but it kept hitting the same snags: Boxes were slightly misaligned, images weren’t the right size, and sometimes a footer would be updated on one slide but not another. One run with Claude Code took about 30 minutes, used roughly 100 million tokens, and still came back with errors.&lt;/p&gt;&lt;p&gt;Nityesh pointed Fable 5 at the Claude Code session log and asked it to review where the PowerPoint skill was breaking down.&lt;/p&gt;&lt;p&gt;Fable 5 found the root problem. Under the hood, a PowerPoint file is a bundle of XML files that store the position, size, styling, and order of everything on a slide. 
Claude was being asked to edit those hidden files directly, so a simple request like “change this phrase” or “move this image two inches left” required the model to find the right hidden text and rewrite the surrounding layout code without disturbing anything else.&lt;/p&gt;&lt;p&gt;Fable 5 built a command-line tool that gives agents a more natural way to work with PowerPoint. If the text on a specific slide needs to be updated, for example, or an image has to be resized, the agent can use the tool to make these targeted changes instead of having to rewrite the entire XML file.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Nityesh’s takeaway: Use Fable 5 to diagnose broken workflows, create the tools or skills that fix them, and then let cheaper models use that infrastructure going forward.&lt;/strong&gt;&lt;/p&gt;&lt;div class="quill-prompt-snippet prompt-snippet" id="quill-prompt-snippet-1781113199996" data-prompt-snippet="" data-source="{&amp;quot;id&amp;quot;:&amp;quot;quill-prompt-snippet-1781113199996&amp;quot;,&amp;quot;label&amp;quot;:&amp;quot;Nityesh's prompt&amp;quot;,&amp;quot;text&amp;quot;:&amp;quot;Here is a session log from an agent trying to complete this workflow: [describe workflow]. It struggled in these ways: [time, cost, errors, bad outputs, repeated failures]. Take a step back and analyze where the current tool, skill, or workflow is breaking down. What is the root cause of the failure, and how would you fix it? Make a plan first. Then build or specify the upgrade. Test it against the same kind of task, and explain how cheaper models could use it later.&amp;quot;,&amp;quot;show_claude&amp;quot;:true,&amp;quot;show_chatgpt&amp;quot;:true,&amp;quot;show_gemini&amp;quot;:true,&amp;quot;show_copy&amp;quot;:true}"&gt;
      &lt;div class="prompt-snippet-header"&gt;
        &lt;span class="prompt-snippet-title"&gt;Nityesh's prompt&lt;/span&gt;
        &lt;div class="prompt-snippet-actions"&gt;&lt;button class="prompt-snippet-btn" aria-label="Open in Gemini" data-tip="Open in Gemini" data-ai="gemini"&gt;&lt;svg width="18" height="18" viewBox="0 0 28 28" fill="none" xmlns="http://www.w3.org/2000/svg"&gt;&lt;path d="M14 28C14 26.0633 13.6267 24.2433 12.88 22.54C12.1567 20.8367 11.165 19.355 9.905 18.095C8.645 16.835 7.16333 15.8433 5.46 15.12C3.75667 14.3733 1.93667 14 0 14C1.93667 14 3.75667 13.6383 5.46 12.915C7.16333 12.1683 8.645 11.165 9.905 9.905C11.165 8.645 12.1567 7.16333 12.88 5.46C13.6267 3.75667 14 1.93667 14 0C14 1.93667 14.3617 3.75667 15.085 5.46C15.8317 7.16333 16.835 8.645 18.095 9.905C19.355 11.165 20.8367 12.1683 22.54 12.915C24.2433 13.6383 26.0633 14 28 14C26.0633 14 24.2433 14.3733 22.54 15.12C20.8367 15.8433 19.355 16.835 18.095 18.095C16.835 19.355 15.8317 20.8367 15.085 22.54C14.3617 24.2433 14 26.0633 14 28Z" fill="url(#ps-gem-quill-prompt-snippet-1781113199996)"&gt;&lt;/path&gt;&lt;defs&gt;&lt;linearGradient id="ps-gem-quill-prompt-snippet-1781113199996" x1="0" y1="0" x2="28" y2="28" gradientUnits="userSpaceOnUse"&gt;&lt;stop stop-color="#1C69FF"&gt;&lt;/stop&gt;&lt;stop offset="1" stop-color="#9747FF"&gt;&lt;/stop&gt;&lt;/linearGradient&gt;&lt;/defs&gt;&lt;/svg&gt;&lt;/button&gt;&lt;button class="prompt-snippet-btn" aria-label="Open in ChatGPT" data-tip="Open in ChatGPT" data-ai="chatgpt"&gt;&lt;svg width="18" height="18" viewBox="0 0 24 24" xmlns="http://www.w3.org/2000/svg" fill="currentColor"&gt;&lt;path d="M22.2819 9.8211a5.9847 5.9847 0 0 0-.5157-4.9108 6.0462 6.0462 0 0 0-6.5098-2.9A6.0651 6.0651 0 0 0 4.9807 4.1818a5.9847 5.9847 0 0 0-3.9977 2.9 6.0462 6.0462 0 0 0 .7427 7.0966 5.98 5.98 0 0 0 .511 4.9107 6.051 6.051 0 0 0 6.5146 2.9001A5.9847 5.9847 0 0 0 13.2599 24a6.0557 6.0557 0 0 0 5.7718-4.2058 5.9894 5.9894 0 0 0 3.9977-2.9001 6.0557 6.0557 0 0 0-.7475-7.0729zm-9.022 12.6081a4.4755 4.4755 0 0 1-2.8764-1.0408l.1419-.0804 4.7783-2.7582a.7948.7948 0 0 0 
.3927-.6813v-6.7369l2.02 1.1686a.071.071 0 0 1 .038.052v5.5826a4.504 4.504 0 0 1-4.4945 4.4944zm-9.6607-4.1254a4.4708 4.4708 0 0 1-.5346-3.0137l.142.0852 4.783 2.7582a.7712.7712 0 0 0 .7806 0l5.8428-3.3685v2.3324a.0804.0804 0 0 1-.0332.0615L9.74 19.9502a4.4992 4.4992 0 0 1-6.1408-1.6464zM2.3408 7.8956a4.485 4.485 0 0 1 2.3655-1.9728V11.6a.7664.7664 0 0 0 .3879.6765l5.8144 3.3543-2.0201 1.1685a.0757.0757 0 0 1-.071 0l-4.8303-2.7865A4.504 4.504 0 0 1 2.3408 7.872zm16.5963 3.8558L13.1038 8.364 15.1192 7.2a.0757.0757 0 0 1 .071 0l4.8303 2.7913a4.4944 4.4944 0 0 1-.6765 8.1042v-5.6772a.79.79 0 0 0-.407-.667zm2.0107-3.0231l-.142-.0852-4.7735-2.7818a.7759.7759 0 0 0-.7854 0L9.409 9.2297V6.8974a.0662.0662 0 0 1 .0284-.0615l4.8303-2.7866a4.4992 4.4992 0 0 1 6.6802 4.66zM8.3065 12.863l-2.02-1.1638a.0804.0804 0 0 1-.038-.0567V6.0742a4.4992 4.4992 0 0 1 7.3757-3.4537l-.142.0805L8.704 5.459a.7948.7948 0 0 0-.3927.6813zm1.0976-2.3654l2.602-1.4998 2.6069 1.4998v2.9994l-2.5974 1.4997-2.6067-1.4997Z"&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/button&gt;&lt;button class="prompt-snippet-btn" aria-label="Open in Claude" data-tip="Open in Claude" data-ai="claude"&gt;&lt;svg height="18" width="18" viewBox="0 0 24 24" xmlns="http://www.w3.org/2000/svg"&gt;&lt;path d="M4.709 15.955l4.72-2.647.08-.23-.08-.128H9.2l-.79-.048-2.698-.073-2.339-.097-2.266-.122-.571-.121L0 11.784l.055-.352.48-.321.686.06 1.52.103 2.278.158 1.652.097 2.449.255h.389l.055-.157-.134-.098-.103-.097-2.358-1.596-2.552-1.688-1.336-.972-.724-.491-.364-.462-.158-1.008.656-.722.881.06.225.061.893.686 1.908 1.476 2.491 1.833.365.304.145-.103.019-.073-.164-.274-1.355-2.446-1.446-2.49-.644-1.032-.17-.619a2.97 2.97 0 01-.104-.729L6.283.134 6.696 0l.996.134.42.364.62 1.414 1.002 2.229 1.555 3.03.456.898.243.832.091.255h.158V9.01l.128-1.706.237-2.095.23-2.695.08-.76.376-.91.747-.492.584.28.48.685-.067.444-.286 1.851-.559 2.903-.364 1.942h.212l.243-.242.985-1.306 1.652-2.064.73-.82.85-.904.547-.431h1.033l.76 1.129-.34 1.166-1.064 
1.347-.881 1.142-1.264 1.7-.79 1.36.073.11.188-.02 2.856-.606 1.543-.28 1.841-.315.833.388.091.395-.328.807-1.969.486-2.309.462-3.439.813-.042.03.049.061 1.549.146.662.036h1.622l3.02.225.79.522.474.638-.079.485-1.215.62-1.64-.389-3.829-.91-1.312-.329h-.182v.11l1.093 1.068 2.006 1.81 2.509 2.33.127.578-.322.455-.34-.049-2.205-1.657-.851-.747-1.926-1.62h-.128v.17l.444.649 2.345 3.521.122 1.08-.17.353-.608.213-.668-.122-1.374-1.925-1.415-2.167-1.143-1.943-.14.08-.674 7.254-.316.37-.729.28-.607-.461-.322-.747.322-1.476.389-1.924.315-1.53.286-1.9.17-.632-.012-.042-.14.018-1.434 1.967-2.18 2.945-1.726 1.845-.414.164-.717-.37.067-.662.401-.589 2.388-3.036 1.44-1.882.93-1.086-.006-.158h-.055L4.132 18.56l-1.13.146-.487-.456.061-.746.231-.243 1.908-1.312-.006.006z" fill="#D97757" fill-rule="nonzero"&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/button&gt;&lt;button class="prompt-snippet-btn" aria-label="Copy prompt" data-tip="Copy prompt" data-copy-prompt=""&gt;&lt;svg width="16" height="16" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"&gt;&lt;rect x="9" y="9" width="13" height="13" rx="2" ry="2"&gt;&lt;/rect&gt;&lt;path d="M5 15H4a2 2 0 0 1-2-2V4a2 2 0 0 1 2-2h9a2 2 0 0 1 2 2v1"&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/button&gt;&lt;/div&gt;
      &lt;/div&gt;
      &lt;div class="prompt-snippet-body"&gt;
        &lt;div class="prompt-snippet-text" data-prompt-text=""&gt;&lt;p&gt;Here is a session log from an agent trying to complete this workflow: [describe workflow]. It struggled in these ways: [time, cost, errors, bad outputs, repeated failures]. Take a step back and analyze where the current tool, skill, or workflow is breaking down. What is the root cause of the failure, and how would you fix it? Make a plan first. Then build or specify the upgrade. Test it against the same kind of task, and explain how cheaper models could use it later.&lt;/p&gt;&lt;/div&gt;
      &lt;/div&gt;
    &lt;/div&gt;&lt;h4&gt;&lt;strong&gt;Example 2: Create a go-to-market strategy&lt;/strong&gt;&lt;/h4&gt;&lt;p&gt;&lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/on-every/austin-tedesco-joins-every-as-head-of-growth" rel="noopener noreferrer" target="_blank"&gt;Austin Tedesco&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;, Every’s head of growth, had a lackluster first experience using Fable 5 to generate a go-to-market strategy for this week’s model launch. He asked it to look across Slack and Notion for relevant context, but the output didn’t feel meaningfully better than what he might have gotten from GPT-5.5 or Opus 4.8. “It did a good job of synthesizing what we had all said and compiling it. But I thought, ‘This plan isn’t any better than what we would have done,’” he says. “It’s acting as a very expensive, verbose executive assistant.”&lt;/p&gt;&lt;p&gt;For his next attempt, Austin put more effort into framing a more complex problem, asking Fable to look at a large set of Every audience insights—survey results, PostHog data, brand positioning work—and audit them against the actual &lt;u&gt;&lt;a href="http://every.to" rel="noopener noreferrer" target="_blank"&gt;Every.to&lt;/a&gt;&lt;/u&gt; website experience, while considering the team’s quarterly plans and internal goals. Then he gave it a clear business objective—increase paid subscriptions among those target customer profiles.&lt;/p&gt;&lt;p&gt;He was also more specific in what he wanted back: a Notion report with 10 data insights that might change how the team operated, plus a stack-ranked list of 10 things Every should ship or try.&lt;/p&gt;&lt;p&gt;The difference was striking. “&lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@danshipper" rel="noopener noreferrer" target="_blank"&gt;Dan [Shipper]&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; and I kept saying, ‘This is nuts,’” Austin says. 
“If we hired a go-to-market engineer to do this and they turned this around in two weeks, we would say, ‘This was an incredible hire.’”&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Austin’s takeaway: Fable 5 performs much better on knowledge work when you give it a complex problem, full access to relevant sources of truth, a clear goal, and a specific output.&lt;/strong&gt; If you ask it to “make a plan,” it may summarize what people already agree on. If you ask it to use data to test assumptions and produce a ranked set of decisions, the results are much sharper.&lt;/p&gt;&lt;div class="quill-prompt-snippet prompt-snippet" id="quill-prompt-snippet-1781113177564" data-prompt-snippet="" data-source="{&amp;quot;id&amp;quot;:&amp;quot;quill-prompt-snippet-1781113177564&amp;quot;,&amp;quot;label&amp;quot;:&amp;quot;Austin's prompt&amp;quot;,&amp;quot;text&amp;quot;:&amp;quot;Use the attached source pack to analyze [business area/launch/audience/funnel].\nSources include: [survey data, customer research, analytics dashboards, website context,\nplanning docs, meeting notes, Slack discussions, internal goals].\nOur goal is [specific business goal] for [target customer/profile].\nDo not just summarize internal consensus. Use the data and source material to test our\nassumptions and identify what should change.\nProduce:\n1. The 10 most important insights that could change how we operate.\n2. A stack-ranked list of 10 things we should ship, try, or stop doing.\n3. The evidence behind each recommendation.\n4. Any source conflicts, stale rules, unclear analytics definitions, or assumptions\n   I should verify before acting.\nIf you find a conclusion that depends heavily on one data source or project rule,\nflag it and explain how you would check whether it is true.\n/lfg—This command sends the agent on a full compound engineering workflow, including planning, building and reviewing. 
It’s a reliable way to get the most out of Fable.&amp;quot;,&amp;quot;show_claude&amp;quot;:true,&amp;quot;show_chatgpt&amp;quot;:true,&amp;quot;show_gemini&amp;quot;:true,&amp;quot;show_copy&amp;quot;:true}"&gt;
      &lt;div class="prompt-snippet-header"&gt;
        &lt;span class="prompt-snippet-title"&gt;Austin's prompt&lt;/span&gt;
        &lt;div class="prompt-snippet-actions"&gt;&lt;button class="prompt-snippet-btn" aria-label="Open in Gemini" data-tip="Open in Gemini" data-ai="gemini"&gt;&lt;svg width="18" height="18" viewBox="0 0 28 28" fill="none" xmlns="http://www.w3.org/2000/svg"&gt;&lt;path d="M14 28C14 26.0633 13.6267 24.2433 12.88 22.54C12.1567 20.8367 11.165 19.355 9.905 18.095C8.645 16.835 7.16333 15.8433 5.46 15.12C3.75667 14.3733 1.93667 14 0 14C1.93667 14 3.75667 13.6383 5.46 12.915C7.16333 12.1683 8.645 11.165 9.905 9.905C11.165 8.645 12.1567 7.16333 12.88 5.46C13.6267 3.75667 14 1.93667 14 0C14 1.93667 14.3617 3.75667 15.085 5.46C15.8317 7.16333 16.835 8.645 18.095 9.905C19.355 11.165 20.8367 12.1683 22.54 12.915C24.2433 13.6383 26.0633 14 28 14C26.0633 14 24.2433 14.3733 22.54 15.12C20.8367 15.8433 19.355 16.835 18.095 18.095C16.835 19.355 15.8317 20.8367 15.085 22.54C14.3617 24.2433 14 26.0633 14 28Z" fill="url(#ps-gem-quill-prompt-snippet-1781113177564)"&gt;&lt;/path&gt;&lt;defs&gt;&lt;linearGradient id="ps-gem-quill-prompt-snippet-1781113177564" x1="0" y1="0" x2="28" y2="28" gradientUnits="userSpaceOnUse"&gt;&lt;stop stop-color="#1C69FF"&gt;&lt;/stop&gt;&lt;stop offset="1" stop-color="#9747FF"&gt;&lt;/stop&gt;&lt;/linearGradient&gt;&lt;/defs&gt;&lt;/svg&gt;&lt;/button&gt;&lt;button class="prompt-snippet-btn" aria-label="Open in ChatGPT" data-tip="Open in ChatGPT" data-ai="chatgpt"&gt;&lt;svg width="18" height="18" viewBox="0 0 24 24" xmlns="http://www.w3.org/2000/svg" fill="currentColor"&gt;&lt;path d="M22.2819 9.8211a5.9847 5.9847 0 0 0-.5157-4.9108 6.0462 6.0462 0 0 0-6.5098-2.9A6.0651 6.0651 0 0 0 4.9807 4.1818a5.9847 5.9847 0 0 0-3.9977 2.9 6.0462 6.0462 0 0 0 .7427 7.0966 5.98 5.98 0 0 0 .511 4.9107 6.051 6.051 0 0 0 6.5146 2.9001A5.9847 5.9847 0 0 0 13.2599 24a6.0557 6.0557 0 0 0 5.7718-4.2058 5.9894 5.9894 0 0 0 3.9977-2.9001 6.0557 6.0557 0 0 0-.7475-7.0729zm-9.022 12.6081a4.4755 4.4755 0 0 1-2.8764-1.0408l.1419-.0804 4.7783-2.7582a.7948.7948 0 0 0 
.3927-.6813v-6.7369l2.02 1.1686a.071.071 0 0 1 .038.052v5.5826a4.504 4.504 0 0 1-4.4945 4.4944zm-9.6607-4.1254a4.4708 4.4708 0 0 1-.5346-3.0137l.142.0852 4.783 2.7582a.7712.7712 0 0 0 .7806 0l5.8428-3.3685v2.3324a.0804.0804 0 0 1-.0332.0615L9.74 19.9502a4.4992 4.4992 0 0 1-6.1408-1.6464zM2.3408 7.8956a4.485 4.485 0 0 1 2.3655-1.9728V11.6a.7664.7664 0 0 0 .3879.6765l5.8144 3.3543-2.0201 1.1685a.0757.0757 0 0 1-.071 0l-4.8303-2.7865A4.504 4.504 0 0 1 2.3408 7.872zm16.5963 3.8558L13.1038 8.364 15.1192 7.2a.0757.0757 0 0 1 .071 0l4.8303 2.7913a4.4944 4.4944 0 0 1-.6765 8.1042v-5.6772a.79.79 0 0 0-.407-.667zm2.0107-3.0231l-.142-.0852-4.7735-2.7818a.7759.7759 0 0 0-.7854 0L9.409 9.2297V6.8974a.0662.0662 0 0 1 .0284-.0615l4.8303-2.7866a4.4992 4.4992 0 0 1 6.6802 4.66zM8.3065 12.863l-2.02-1.1638a.0804.0804 0 0 1-.038-.0567V6.0742a4.4992 4.4992 0 0 1 7.3757-3.4537l-.142.0805L8.704 5.459a.7948.7948 0 0 0-.3927.6813zm1.0976-2.3654l2.602-1.4998 2.6069 1.4998v2.9994l-2.5974 1.4997-2.6067-1.4997Z"&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/button&gt;&lt;button class="prompt-snippet-btn" aria-label="Open in Claude" data-tip="Open in Claude" data-ai="claude"&gt;&lt;svg height="18" width="18" viewBox="0 0 24 24" xmlns="http://www.w3.org/2000/svg"&gt;&lt;path d="M4.709 15.955l4.72-2.647.08-.23-.08-.128H9.2l-.79-.048-2.698-.073-2.339-.097-2.266-.122-.571-.121L0 11.784l.055-.352.48-.321.686.06 1.52.103 2.278.158 1.652.097 2.449.255h.389l.055-.157-.134-.098-.103-.097-2.358-1.596-2.552-1.688-1.336-.972-.724-.491-.364-.462-.158-1.008.656-.722.881.06.225.061.893.686 1.908 1.476 2.491 1.833.365.304.145-.103.019-.073-.164-.274-1.355-2.446-1.446-2.49-.644-1.032-.17-.619a2.97 2.97 0 01-.104-.729L6.283.134 6.696 0l.996.134.42.364.62 1.414 1.002 2.229 1.555 3.03.456.898.243.832.091.255h.158V9.01l.128-1.706.237-2.095.23-2.695.08-.76.376-.91.747-.492.584.28.48.685-.067.444-.286 1.851-.559 2.903-.364 1.942h.212l.243-.242.985-1.306 1.652-2.064.73-.82.85-.904.547-.431h1.033l.76 1.129-.34 1.166-1.064 
1.347-.881 1.142-1.264 1.7-.79 1.36.073.11.188-.02 2.856-.606 1.543-.28 1.841-.315.833.388.091.395-.328.807-1.969.486-2.309.462-3.439.813-.042.03.049.061 1.549.146.662.036h1.622l3.02.225.79.522.474.638-.079.485-1.215.62-1.64-.389-3.829-.91-1.312-.329h-.182v.11l1.093 1.068 2.006 1.81 2.509 2.33.127.578-.322.455-.34-.049-2.205-1.657-.851-.747-1.926-1.62h-.128v.17l.444.649 2.345 3.521.122 1.08-.17.353-.608.213-.668-.122-1.374-1.925-1.415-2.167-1.143-1.943-.14.08-.674 7.254-.316.37-.729.28-.607-.461-.322-.747.322-1.476.389-1.924.315-1.53.286-1.9.17-.632-.012-.042-.14.018-1.434 1.967-2.18 2.945-1.726 1.845-.414.164-.717-.37.067-.662.401-.589 2.388-3.036 1.44-1.882.93-1.086-.006-.158h-.055L4.132 18.56l-1.13.146-.487-.456.061-.746.231-.243 1.908-1.312-.006.006z" fill="#D97757" fill-rule="nonzero"&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/button&gt;&lt;button class="prompt-snippet-btn" aria-label="Copy prompt" data-tip="Copy prompt" data-copy-prompt=""&gt;&lt;svg width="16" height="16" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"&gt;&lt;rect x="9" y="9" width="13" height="13" rx="2" ry="2"&gt;&lt;/rect&gt;&lt;path d="M5 15H4a2 2 0 0 1-2-2V4a2 2 0 0 1 2-2h9a2 2 0 0 1 2 2v1"&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/button&gt;&lt;/div&gt;
      &lt;/div&gt;
      &lt;div class="prompt-snippet-body"&gt;
        &lt;div class="prompt-snippet-text" data-prompt-text=""&gt;&lt;p&gt;Use the attached source pack to analyze [business area/launch/audience/funnel].&lt;br&gt;Sources include: [survey data, customer research, analytics dashboards, website context,&lt;br&gt;planning docs, meeting notes, Slack discussions, internal goals].&lt;br&gt;Our goal is [specific business goal] for [target customer/profile].&lt;br&gt;Do not just summarize internal consensus. Use the data and source material to test our&lt;br&gt;assumptions and identify what should change.&lt;br&gt;Produce:&lt;br&gt;1. The 10 most important insights that could change how we operate.&lt;br&gt;2. A stack-ranked list of 10 things we should ship, try, or stop doing.&lt;br&gt;3. The evidence behind each recommendation.&lt;br&gt;4. Any source conflicts, stale rules, unclear analytics definitions, or assumptions&lt;br&gt;   I should verify before acting.&lt;br&gt;If you find a conclusion that depends heavily on one data source or project rule,&lt;br&gt;flag it and explain how you would check whether it is true.&lt;br&gt;/lfg—This command sends the agent on a full compound engineering workflow, including planning, building and reviewing. It’s a reliable way to get the most out of Fable.&lt;/p&gt;&lt;/div&gt;
      &lt;/div&gt;
    &lt;/div&gt;&lt;h4&gt;&lt;strong&gt;Example 3: Turn feedback into batched changes&lt;/strong&gt;&lt;/h4&gt;&lt;p&gt;Before Fable 5, Kieran trusted agents with narrow, well-defined product fixes, such as making a keyboard shortcut work or resolving a bug from a screen recording. His broader workflow was &lt;u&gt;&lt;a href="https://x.com/kieranklaassen/status/2064457735845171518" rel="noopener noreferrer" target="_blank"&gt;already in place&lt;/a&gt;&lt;/u&gt;: Pull feedback from Slack or videos, give it to an agent, and review the result at the end. He calls it the &lt;u&gt;&lt;a href="https://every.to/context-window/you-re-the-bread-in-the-ai-sandwich" rel="noopener noreferrer" target="_blank"&gt;“AI sandwich”&lt;/a&gt;&lt;/u&gt;: human at the start, machine in the middle, human at the end. &lt;/p&gt;&lt;p&gt;This weekend, Kieran had Fable 5 pull everything a colleague had said about Cora over the previous two days on Slack, analyze it, and make a list of product fixes. His agent handled all the fixes, and Kieran checked the result at the end. In one run, he made 30 fixes in a single batch and had the agent check that the changes didn’t interfere with one another, instead of reviewing 10 small tasks one by one.&lt;/p&gt;&lt;p&gt;Now he’s working on building the next layer: having the agent automatically pull product feedback from Slack on a schedule, evaluate it against Cora’s vision document and user personas, surface changes that seem to be worth making, and present them to him for approval. &lt;/p&gt;&lt;p&gt;That kind of loop depends on the quality of the raw material. Screen recordings are useful because they show the model what a written bug report often leaves out, such as what someone clicked on, what happened next, and how that differed from what they expected. 
Slack can also work as a source of signal, especially when the comments come from people with strong product judgment.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Kieran’s takeaway: Fable 5 is strongest when the work is connected to a feedback loop. &lt;/strong&gt;The model can gather, group, and act on feedback, but the quality of the result still depends on the quality of the input. His role is to decide which ideas are worth acting on. &lt;/p&gt;&lt;div class="quill-prompt-snippet prompt-snippet" id="quill-prompt-snippet-1781113153578" data-prompt-snippet="" data-source="{&amp;quot;id&amp;quot;:&amp;quot;quill-prompt-snippet-1781113153578&amp;quot;,&amp;quot;label&amp;quot;:&amp;quot;Kieran's Prompt&amp;quot;,&amp;quot;text&amp;quot;:&amp;quot;Collect product feedback about [product/feature/workflow] from these sources:\n[Slack channel, support tickets, screen recordings, screenshots, production logs,\ncustomer calls, meeting notes].\nGroup the feedback into themes. Identify:\n1. What is clearly actionable\n2. What needs my judgment before acting\n3. What conflicts with our strategy, personas, or product direction\n4. What evidence you used\nFor the actionable items, create one batch plan. If you have the tools and approval\nto make the changes, implement them together. Make sure the fixes do not conflict.\nWhen you are done, show me what changed, what you skipped, what still needs my\nreview, and how you verified the work.&amp;quot;,&amp;quot;show_claude&amp;quot;:true,&amp;quot;show_chatgpt&amp;quot;:true,&amp;quot;show_gemini&amp;quot;:true,&amp;quot;show_copy&amp;quot;:true}"&gt;
      &lt;div class="prompt-snippet-header"&gt;
        &lt;span class="prompt-snippet-title"&gt;Kieran's Prompt&lt;/span&gt;
        &lt;div class="prompt-snippet-actions"&gt;&lt;button class="prompt-snippet-btn" aria-label="Open in Gemini" data-tip="Open in Gemini" data-ai="gemini"&gt;&lt;svg width="18" height="18" viewBox="0 0 28 28" fill="none" xmlns="http://www.w3.org/2000/svg"&gt;&lt;path d="M14 28C14 26.0633 13.6267 24.2433 12.88 22.54C12.1567 20.8367 11.165 19.355 9.905 18.095C8.645 16.835 7.16333 15.8433 5.46 15.12C3.75667 14.3733 1.93667 14 0 14C1.93667 14 3.75667 13.6383 5.46 12.915C7.16333 12.1683 8.645 11.165 9.905 9.905C11.165 8.645 12.1567 7.16333 12.88 5.46C13.6267 3.75667 14 1.93667 14 0C14 1.93667 14.3617 3.75667 15.085 5.46C15.8317 7.16333 16.835 8.645 18.095 9.905C19.355 11.165 20.8367 12.1683 22.54 12.915C24.2433 13.6383 26.0633 14 28 14C26.0633 14 24.2433 14.3733 22.54 15.12C20.8367 15.8433 19.355 16.835 18.095 18.095C16.835 19.355 15.8317 20.8367 15.085 22.54C14.3617 24.2433 14 26.0633 14 28Z" fill="url(#ps-gem-quill-prompt-snippet-1781113153578)"&gt;&lt;/path&gt;&lt;defs&gt;&lt;linearGradient id="ps-gem-quill-prompt-snippet-1781113153578" x1="0" y1="0" x2="28" y2="28" gradientUnits="userSpaceOnUse"&gt;&lt;stop stop-color="#1C69FF"&gt;&lt;/stop&gt;&lt;stop offset="1" stop-color="#9747FF"&gt;&lt;/stop&gt;&lt;/linearGradient&gt;&lt;/defs&gt;&lt;/svg&gt;&lt;/button&gt;&lt;button class="prompt-snippet-btn" aria-label="Open in ChatGPT" data-tip="Open in ChatGPT" data-ai="chatgpt"&gt;&lt;svg width="18" height="18" viewBox="0 0 24 24" xmlns="http://www.w3.org/2000/svg" fill="currentColor"&gt;&lt;path d="M22.2819 9.8211a5.9847 5.9847 0 0 0-.5157-4.9108 6.0462 6.0462 0 0 0-6.5098-2.9A6.0651 6.0651 0 0 0 4.9807 4.1818a5.9847 5.9847 0 0 0-3.9977 2.9 6.0462 6.0462 0 0 0 .7427 7.0966 5.98 5.98 0 0 0 .511 4.9107 6.051 6.051 0 0 0 6.5146 2.9001A5.9847 5.9847 0 0 0 13.2599 24a6.0557 6.0557 0 0 0 5.7718-4.2058 5.9894 5.9894 0 0 0 3.9977-2.9001 6.0557 6.0557 0 0 0-.7475-7.0729zm-9.022 12.6081a4.4755 4.4755 0 0 1-2.8764-1.0408l.1419-.0804 4.7783-2.7582a.7948.7948 0 0 0 
.3927-.6813v-6.7369l2.02 1.1686a.071.071 0 0 1 .038.052v5.5826a4.504 4.504 0 0 1-4.4945 4.4944zm-9.6607-4.1254a4.4708 4.4708 0 0 1-.5346-3.0137l.142.0852 4.783 2.7582a.7712.7712 0 0 0 .7806 0l5.8428-3.3685v2.3324a.0804.0804 0 0 1-.0332.0615L9.74 19.9502a4.4992 4.4992 0 0 1-6.1408-1.6464zM2.3408 7.8956a4.485 4.485 0 0 1 2.3655-1.9728V11.6a.7664.7664 0 0 0 .3879.6765l5.8144 3.3543-2.0201 1.1685a.0757.0757 0 0 1-.071 0l-4.8303-2.7865A4.504 4.504 0 0 1 2.3408 7.872zm16.5963 3.8558L13.1038 8.364 15.1192 7.2a.0757.0757 0 0 1 .071 0l4.8303 2.7913a4.4944 4.4944 0 0 1-.6765 8.1042v-5.6772a.79.79 0 0 0-.407-.667zm2.0107-3.0231l-.142-.0852-4.7735-2.7818a.7759.7759 0 0 0-.7854 0L9.409 9.2297V6.8974a.0662.0662 0 0 1 .0284-.0615l4.8303-2.7866a4.4992 4.4992 0 0 1 6.6802 4.66zM8.3065 12.863l-2.02-1.1638a.0804.0804 0 0 1-.038-.0567V6.0742a4.4992 4.4992 0 0 1 7.3757-3.4537l-.142.0805L8.704 5.459a.7948.7948 0 0 0-.3927.6813zm1.0976-2.3654l2.602-1.4998 2.6069 1.4998v2.9994l-2.5974 1.4997-2.6067-1.4997Z"&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/button&gt;&lt;button class="prompt-snippet-btn" aria-label="Open in Claude" data-tip="Open in Claude" data-ai="claude"&gt;&lt;svg height="18" width="18" viewBox="0 0 24 24" xmlns="http://www.w3.org/2000/svg"&gt;&lt;path d="M4.709 15.955l4.72-2.647.08-.23-.08-.128H9.2l-.79-.048-2.698-.073-2.339-.097-2.266-.122-.571-.121L0 11.784l.055-.352.48-.321.686.06 1.52.103 2.278.158 1.652.097 2.449.255h.389l.055-.157-.134-.098-.103-.097-2.358-1.596-2.552-1.688-1.336-.972-.724-.491-.364-.462-.158-1.008.656-.722.881.06.225.061.893.686 1.908 1.476 2.491 1.833.365.304.145-.103.019-.073-.164-.274-1.355-2.446-1.446-2.49-.644-1.032-.17-.619a2.97 2.97 0 01-.104-.729L6.283.134 6.696 0l.996.134.42.364.62 1.414 1.002 2.229 1.555 3.03.456.898.243.832.091.255h.158V9.01l.128-1.706.237-2.095.23-2.695.08-.76.376-.91.747-.492.584.28.48.685-.067.444-.286 1.851-.559 2.903-.364 1.942h.212l.243-.242.985-1.306 1.652-2.064.73-.82.85-.904.547-.431h1.033l.76 1.129-.34 1.166-1.064 
1.347-.881 1.142-1.264 1.7-.79 1.36.073.11.188-.02 2.856-.606 1.543-.28 1.841-.315.833.388.091.395-.328.807-1.969.486-2.309.462-3.439.813-.042.03.049.061 1.549.146.662.036h1.622l3.02.225.79.522.474.638-.079.485-1.215.62-1.64-.389-3.829-.91-1.312-.329h-.182v.11l1.093 1.068 2.006 1.81 2.509 2.33.127.578-.322.455-.34-.049-2.205-1.657-.851-.747-1.926-1.62h-.128v.17l.444.649 2.345 3.521.122 1.08-.17.353-.608.213-.668-.122-1.374-1.925-1.415-2.167-1.143-1.943-.14.08-.674 7.254-.316.37-.729.28-.607-.461-.322-.747.322-1.476.389-1.924.315-1.53.286-1.9.17-.632-.012-.042-.14.018-1.434 1.967-2.18 2.945-1.726 1.845-.414.164-.717-.37.067-.662.401-.589 2.388-3.036 1.44-1.882.93-1.086-.006-.158h-.055L4.132 18.56l-1.13.146-.487-.456.061-.746.231-.243 1.908-1.312-.006.006z" fill="#D97757" fill-rule="nonzero"&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/button&gt;&lt;button class="prompt-snippet-btn" aria-label="Copy prompt" data-tip="Copy prompt" data-copy-prompt=""&gt;&lt;svg width="16" height="16" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"&gt;&lt;rect x="9" y="9" width="13" height="13" rx="2" ry="2"&gt;&lt;/rect&gt;&lt;path d="M5 15H4a2 2 0 0 1-2-2V4a2 2 0 0 1 2-2h9a2 2 0 0 1 2 2v1"&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/button&gt;&lt;/div&gt;
      &lt;/div&gt;
      &lt;div class="prompt-snippet-body"&gt;
        &lt;div class="prompt-snippet-text" data-prompt-text=""&gt;&lt;p&gt;Collect product feedback about [product/feature/workflow] from these sources:&lt;br&gt;[Slack channel, support tickets, screen recordings, screenshots, production logs,&lt;br&gt;customer calls, meeting notes].&lt;br&gt;Group the feedback into themes. Identify:&lt;br&gt;1. What is clearly actionable&lt;br&gt;2. What needs my judgment before acting&lt;br&gt;3. What conflicts with our strategy, personas, or product direction&lt;br&gt;4. What evidence you used&lt;br&gt;For the actionable items, create one batch plan. If you have the tools and approval&lt;br&gt;to make the changes, implement them together. Make sure the fixes do not conflict.&lt;br&gt;When you are done, show me what changed, what you skipped, what still needs my&lt;br&gt;review, and how you verified the work.&lt;/p&gt;&lt;/div&gt;
      &lt;/div&gt;
    &lt;/div&gt;&lt;h4&gt;&lt;strong&gt;Example 4: Build from an original plan&lt;/strong&gt;&lt;/h4&gt;&lt;p&gt;Years ago, &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@williewilliams" rel="noopener noreferrer" target="_blank"&gt;Willie Williams&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;, Every’s head of platform, built a website for a friend who was in school to become a therapist and needed a better way to create genograms, a kind of annotated family tree used in some therapeutic intake processes. The original version took him a couple of weeks, and it had a stubborn bug: While running, the site would get slower and slower until it crashed.&lt;/p&gt;&lt;p&gt;Willie had tried to fix the issue with other models before, to no avail. Fable 5 didn’t get it on the first try either. When Willie asked it to only look at the code, it confidently suggested a fix that didn’t work. Then, he told Fable 5 to run the app locally, watch what was happening, and figure it out from there. Once it could see the site running, it fixed the bug.&lt;/p&gt;&lt;p&gt;After that, Willie gave Fable 5 a bigger assignment: Build the product from scratch using the original spec, without looking at the version he had already built. The spec included who the product was for, what users needed to create, how the visual workspace should behave, and the edge cases the app needed to handle. Willie ran the same test against Opus 4.8, GPT-5.5, and Fable 5. Fable 5 was significantly better than Opus 4.8 or GPT-5.5.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Willie’s takeaway: Fable 5 is well suited to assignments where you can give it the same materials you would give a senior engineer&lt;/strong&gt;: the product goal, relevant domain context, tricky edge cases, and a clear sense of what the first version needs to do. 
It can work from a plan, make judgment calls, and produce a first version without being walked through every step.&lt;/p&gt;&lt;div class="quill-prompt-snippet prompt-snippet" id="quill-prompt-snippet-1781113126263" data-prompt-snippet="" data-source="{&amp;quot;id&amp;quot;:&amp;quot;quill-prompt-snippet-1781113126263&amp;quot;,&amp;quot;label&amp;quot;:&amp;quot;Willie's prompt&amp;quot;,&amp;quot;text&amp;quot;:&amp;quot;I want you to build a first working version of [product/website/tool].\nHere is the original product spec: [paste spec]\nHere is who it is for: [users]\nHere is the domain context you need: [terms, workflows, examples, constraints]\nHere are the tricky cases the product must handle: [edge cases]\nHere is what the first version needs to do: [requirements]\nHere is what can be rough for now: [scope boundaries]\nMake a plan first. Then build the first version. Run it locally or otherwise test\nit in the environment where it will actually be used. When you are done, show me\nhow to try it, what decisions you made, what you left out, and what I should\nreview carefully.&amp;quot;,&amp;quot;show_claude&amp;quot;:true,&amp;quot;show_chatgpt&amp;quot;:true,&amp;quot;show_gemini&amp;quot;:true,&amp;quot;show_copy&amp;quot;:true}"&gt;
      &lt;div class="prompt-snippet-header"&gt;
        &lt;span class="prompt-snippet-title"&gt;Willie's prompt&lt;/span&gt;
        &lt;div class="prompt-snippet-actions"&gt;&lt;button class="prompt-snippet-btn" aria-label="Open in Gemini" data-tip="Open in Gemini" data-ai="gemini"&gt;&lt;svg width="18" height="18" viewBox="0 0 28 28" fill="none" xmlns="http://www.w3.org/2000/svg"&gt;&lt;path d="M14 28C14 26.0633 13.6267 24.2433 12.88 22.54C12.1567 20.8367 11.165 19.355 9.905 18.095C8.645 16.835 7.16333 15.8433 5.46 15.12C3.75667 14.3733 1.93667 14 0 14C1.93667 14 3.75667 13.6383 5.46 12.915C7.16333 12.1683 8.645 11.165 9.905 9.905C11.165 8.645 12.1567 7.16333 12.88 5.46C13.6267 3.75667 14 1.93667 14 0C14 1.93667 14.3617 3.75667 15.085 5.46C15.8317 7.16333 16.835 8.645 18.095 9.905C19.355 11.165 20.8367 12.1683 22.54 12.915C24.2433 13.6383 26.0633 14 28 14C26.0633 14 24.2433 14.3733 22.54 15.12C20.8367 15.8433 19.355 16.835 18.095 18.095C16.835 19.355 15.8317 20.8367 15.085 22.54C14.3617 24.2433 14 26.0633 14 28Z" fill="url(#ps-gem-quill-prompt-snippet-1781113126263)"&gt;&lt;/path&gt;&lt;defs&gt;&lt;linearGradient id="ps-gem-quill-prompt-snippet-1781113126263" x1="0" y1="0" x2="28" y2="28" gradientUnits="userSpaceOnUse"&gt;&lt;stop stop-color="#1C69FF"&gt;&lt;/stop&gt;&lt;stop offset="1" stop-color="#9747FF"&gt;&lt;/stop&gt;&lt;/linearGradient&gt;&lt;/defs&gt;&lt;/svg&gt;&lt;/button&gt;&lt;button class="prompt-snippet-btn" aria-label="Open in ChatGPT" data-tip="Open in ChatGPT" data-ai="chatgpt"&gt;&lt;svg width="18" height="18" viewBox="0 0 24 24" xmlns="http://www.w3.org/2000/svg" fill="currentColor"&gt;&lt;path d="M22.2819 9.8211a5.9847 5.9847 0 0 0-.5157-4.9108 6.0462 6.0462 0 0 0-6.5098-2.9A6.0651 6.0651 0 0 0 4.9807 4.1818a5.9847 5.9847 0 0 0-3.9977 2.9 6.0462 6.0462 0 0 0 .7427 7.0966 5.98 5.98 0 0 0 .511 4.9107 6.051 6.051 0 0 0 6.5146 2.9001A5.9847 5.9847 0 0 0 13.2599 24a6.0557 6.0557 0 0 0 5.7718-4.2058 5.9894 5.9894 0 0 0 3.9977-2.9001 6.0557 6.0557 0 0 0-.7475-7.0729zm-9.022 12.6081a4.4755 4.4755 0 0 1-2.8764-1.0408l.1419-.0804 4.7783-2.7582a.7948.7948 0 0 0 
.3927-.6813v-6.7369l2.02 1.1686a.071.071 0 0 1 .038.052v5.5826a4.504 4.504 0 0 1-4.4945 4.4944zm-9.6607-4.1254a4.4708 4.4708 0 0 1-.5346-3.0137l.142.0852 4.783 2.7582a.7712.7712 0 0 0 .7806 0l5.8428-3.3685v2.3324a.0804.0804 0 0 1-.0332.0615L9.74 19.9502a4.4992 4.4992 0 0 1-6.1408-1.6464zM2.3408 7.8956a4.485 4.485 0 0 1 2.3655-1.9728V11.6a.7664.7664 0 0 0 .3879.6765l5.8144 3.3543-2.0201 1.1685a.0757.0757 0 0 1-.071 0l-4.8303-2.7865A4.504 4.504 0 0 1 2.3408 7.872zm16.5963 3.8558L13.1038 8.364 15.1192 7.2a.0757.0757 0 0 1 .071 0l4.8303 2.7913a4.4944 4.4944 0 0 1-.6765 8.1042v-5.6772a.79.79 0 0 0-.407-.667zm2.0107-3.0231l-.142-.0852-4.7735-2.7818a.7759.7759 0 0 0-.7854 0L9.409 9.2297V6.8974a.0662.0662 0 0 1 .0284-.0615l4.8303-2.7866a4.4992 4.4992 0 0 1 6.6802 4.66zM8.3065 12.863l-2.02-1.1638a.0804.0804 0 0 1-.038-.0567V6.0742a4.4992 4.4992 0 0 1 7.3757-3.4537l-.142.0805L8.704 5.459a.7948.7948 0 0 0-.3927.6813zm1.0976-2.3654l2.602-1.4998 2.6069 1.4998v2.9994l-2.5974 1.4997-2.6067-1.4997Z"&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/button&gt;&lt;button class="prompt-snippet-btn" aria-label="Open in Claude" data-tip="Open in Claude" data-ai="claude"&gt;&lt;svg height="18" width="18" viewBox="0 0 24 24" xmlns="http://www.w3.org/2000/svg"&gt;&lt;path d="M4.709 15.955l4.72-2.647.08-.23-.08-.128H9.2l-.79-.048-2.698-.073-2.339-.097-2.266-.122-.571-.121L0 11.784l.055-.352.48-.321.686.06 1.52.103 2.278.158 1.652.097 2.449.255h.389l.055-.157-.134-.098-.103-.097-2.358-1.596-2.552-1.688-1.336-.972-.724-.491-.364-.462-.158-1.008.656-.722.881.06.225.061.893.686 1.908 1.476 2.491 1.833.365.304.145-.103.019-.073-.164-.274-1.355-2.446-1.446-2.49-.644-1.032-.17-.619a2.97 2.97 0 01-.104-.729L6.283.134 6.696 0l.996.134.42.364.62 1.414 1.002 2.229 1.555 3.03.456.898.243.832.091.255h.158V9.01l.128-1.706.237-2.095.23-2.695.08-.76.376-.91.747-.492.584.28.48.685-.067.444-.286 1.851-.559 2.903-.364 1.942h.212l.243-.242.985-1.306 1.652-2.064.73-.82.85-.904.547-.431h1.033l.76 1.129-.34 1.166-1.064 
1.347-.881 1.142-1.264 1.7-.79 1.36.073.11.188-.02 2.856-.606 1.543-.28 1.841-.315.833.388.091.395-.328.807-1.969.486-2.309.462-3.439.813-.042.03.049.061 1.549.146.662.036h1.622l3.02.225.79.522.474.638-.079.485-1.215.62-1.64-.389-3.829-.91-1.312-.329h-.182v.11l1.093 1.068 2.006 1.81 2.509 2.33.127.578-.322.455-.34-.049-2.205-1.657-.851-.747-1.926-1.62h-.128v.17l.444.649 2.345 3.521.122 1.08-.17.353-.608.213-.668-.122-1.374-1.925-1.415-2.167-1.143-1.943-.14.08-.674 7.254-.316.37-.729.28-.607-.461-.322-.747.322-1.476.389-1.924.315-1.53.286-1.9.17-.632-.012-.042-.14.018-1.434 1.967-2.18 2.945-1.726 1.845-.414.164-.717-.37.067-.662.401-.589 2.388-3.036 1.44-1.882.93-1.086-.006-.158h-.055L4.132 18.56l-1.13.146-.487-.456.061-.746.231-.243 1.908-1.312-.006.006z" fill="#D97757" fill-rule="nonzero"&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/button&gt;&lt;button class="prompt-snippet-btn" aria-label="Copy prompt" data-tip="Copy prompt" data-copy-prompt=""&gt;&lt;svg width="16" height="16" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"&gt;&lt;rect x="9" y="9" width="13" height="13" rx="2" ry="2"&gt;&lt;/rect&gt;&lt;path d="M5 15H4a2 2 0 0 1-2-2V4a2 2 0 0 1 2-2h9a2 2 0 0 1 2 2v1"&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/button&gt;&lt;/div&gt;
      &lt;/div&gt;
      &lt;div class="prompt-snippet-body"&gt;
        &lt;div class="prompt-snippet-text" data-prompt-text=""&gt;&lt;p&gt;I want you to build a first working version of [product/website/tool].&lt;br&gt;Here is the original product spec: [paste spec]&lt;br&gt;Here is who it is for: [users]&lt;br&gt;Here is the domain context you need: [terms, workflows, examples, constraints]&lt;br&gt;Here are the tricky cases the product must handle: [edge cases]&lt;br&gt;Here is what the first version needs to do: [requirements]&lt;br&gt;Here is what can be rough for now: [scope boundaries]&lt;br&gt;Make a plan first. Then build the first version. Run it locally or otherwise test&lt;br&gt;it in the environment where it will actually be used. When you are done, show me&lt;br&gt;how to try it, what decisions you made, what you left out, and what I should&lt;br&gt;review carefully.&lt;/p&gt;&lt;/div&gt;
      &lt;/div&gt;
    &lt;/div&gt;&lt;h4&gt;&lt;strong&gt;Cost is only part of the decision&lt;/strong&gt;&lt;/h4&gt;&lt;p&gt;Fable 5 is powerful—and expensive. The model is available on Claude’s paid plans until June 22, after which it will move to token-based pricing. (It costs $10 per million input tokens and $50 per million output tokens, making it roughly two times the cost of Opus 4.8 and three times the cost of &lt;u&gt;&lt;a href="https://every.to/vibe-check/vibe-check-anthropic-just-made-opus-cheaper-without-calling-it-that" rel="noopener noreferrer" target="_blank"&gt;Sonnet 4.6&lt;/a&gt;&lt;/u&gt;.) &lt;/p&gt;&lt;p&gt;But cost is only one part of the decision. Fable 5 is also slow, especially when you run it at higher effort levels, so it makes the most sense for large, complex, delegable jobs you can trust the model to complete and review later, such as fixing a broken workflow, building a feature or app, synthesizing a lot of source material, or reviewing a codebase. For quick edits, small bugs, and back-and-forth brainstorming, faster and cheaper models remain the better option. Read more about the model’s strengths and weaknesses in &lt;u&gt;&lt;a href="https://every.to/vibe-check/anthropic-mythos-our-fable-vibe-check" rel="noopener noreferrer" target="_blank"&gt;our hands-on Vibe Check&lt;/a&gt;&lt;/u&gt;. &lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;em&gt;&lt;a href="https://every.to/@laura_27bbaf_1" rel="noopener noreferrer" target="_blank"&gt;Laura Entis&lt;/a&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt; is a staff writer at Every. You can follow her on &lt;a href="https://www.linkedin.com/in/lauraentis/" rel="noopener noreferrer" target="_blank"&gt;LinkedIn&lt;/a&gt;. 
To read more essays like this, subscribe to &lt;u&gt;&lt;a href="https://every.to/subscribe" rel="noopener noreferrer" target="_blank"&gt;Every&lt;/a&gt;&lt;/u&gt;, and follow us on X at &lt;u&gt;&lt;a href="http://twitter.com/every" rel="noopener noreferrer" target="_blank"&gt;@every&lt;/a&gt;&lt;/u&gt; and on &lt;u&gt;&lt;a href="https://www.linkedin.com/company/everyinc/" rel="noopener noreferrer" target="_blank"&gt;LinkedIn&lt;/a&gt;&lt;/u&gt;.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;For sponsorship opportunities, reach out to sponsorships@every.to.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;Help us scale the only subscription you need to stay at the edge of AI. Explore &lt;u&gt;&lt;a href="https://www.notion.so/Jobs-Every-25cca4f355ac80c5ad6ee7a6e93d6b4e?pvs=21" rel="noopener noreferrer" target="_blank"&gt;open roles at Every&lt;/a&gt;&lt;/u&gt;.&lt;/em&gt;&lt;/p&gt;&lt;div class="quill-button" data-source="{&amp;quot;id&amp;quot;:&amp;quot;quill-button-1769187301610&amp;quot;,&amp;quot;url&amp;quot;:&amp;quot;https://every.to/subscribe?source=post_button&amp;quot;,&amp;quot;text&amp;quot;:&amp;quot;Subscribe&amp;quot;}" id="quill-button-1769187301610"&gt;&lt;a href="https://every.to/subscribe?source=post_button"&gt;Subscribe&lt;/a&gt;&lt;/div&gt;</description>
      <author>Laura Entis / Context Window</author>
      <pubDate>2026-06-10 17:00:00 -0400</pubDate>
      <guid>https://every.to/context-window/how-to-get-the-most-out-of-fable-5</guid>
      <link>https://every.to/context-window/how-to-get-the-most-out-of-fable-5</link>
    </item>
    <item>
      <title>My Editor Caught Me Sounding Like AI. Now AI Catches Me First.</title>
      <description>&lt;table&gt;&lt;tr&gt;&lt;td&gt;&lt;img alt="Working Overtime" src="https://d24ovhgu8s7341.cloudfront.net/uploads/publication/logo/100/small_Screenshot_2024-11-22_at_9.33.36_AM.png" /&gt;&lt;/td&gt;&lt;td&gt;&lt;/td&gt;&lt;td&gt;&lt;table&gt;&lt;tr&gt;&lt;td&gt;by &lt;a href="https://every.to/@katie.parrott12" itemprop="name"&gt;Katie Parrott&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;in &lt;a href="https://every.to/working-overtime"&gt;Working Overtime&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;figure&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/post/cover/4270/full_page_cover_7496745ef501ef76-Monday_s_piece_1.png"&gt;&lt;figcaption&gt;Midjourney/Every illustration. &lt;/figcaption&gt;&lt;/figure&gt;&lt;p&gt;&lt;em&gt;Was this newsletter forwarded to you? &lt;u&gt;&lt;a href="https://every.to/account" rel="noopener noreferrer" target="_blank"&gt;Sign up&lt;/a&gt;&lt;/u&gt; to get it in your inbox.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;Before a recent one-on-one with &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@kate_1767" rel="noopener noreferrer" target="_blank"&gt;Kate Lee&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;, Every’s editor in chief, I opened our shared document and found a list of my own writing fails staring back at me. My drafts had picked up too many of the AI tells that both I—and you—know how to spot from across the room: the symmetrical sentence structures, the little rhetorical throat-clears, the phrases that sound profound on first pass but on closer inspection contain nothing but air, and those pesky sets of three. &lt;/p&gt;&lt;p&gt;The worst part was that I should know better. I am the person at Every who writes about writing with AI while using AI to write about writing with AI. 
I have custom agents, &lt;u&gt;&lt;a href="https://every.to/guides/ai-style-guide?source=post_button" rel="noopener noreferrer" target="_blank"&gt;style guides&lt;/a&gt;&lt;/u&gt;, editorial workflows, and an apparently bottomless appetite for turning every lesson into a system. And still, I had &lt;u&gt;&lt;a href="https://every.to/working-overtime/we-need-to-talk-about-ai-autopilot" rel="noopener noreferrer" target="_blank"&gt;let the machine’s smoothness&lt;/a&gt;&lt;/u&gt; pass for my own judgment enough times that my editors felt the need to intervene.&lt;/p&gt;&lt;p&gt;After the meeting, I did what I generally do when I learn something new, embarrassing or otherwise: I baked it into documentation for my agents. I opened the notes, pulled out the patterns Kate had flagged, and listed them in a new skill called /guardrails, which turns any agent I write with into an exacting editorial specialist that keeps me honest.&lt;/p&gt;&lt;p&gt;I’ll never be completely done with /guardrails, or any of the review skills like it that I’ve built, because my human tics and tendencies will move around like a squirmy toddler. But I’d rather make new mistakes than keep repeating the old ones. Review skills are the mechanism by which I do that. They’re another form of editor, one that can catch a draft’s more annoying weak spots before they become a human editor’s problem. 
&lt;/p&gt;&lt;div class="quill-block-image" id="quill-block-image-1780925110415-x51ztztds" data-source="{&amp;quot;dom_id&amp;quot;:&amp;quot;quill-block-image-1780925110415-x51ztztds&amp;quot;,&amp;quot;link&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4270/optimized_f33ae1cf-f442-4546-ba0c-fe99a49747c8.png&amp;quot;,&amp;quot;image&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4270/optimized_f33ae1cf-f442-4546-ba0c-fe99a49747c8.png&amp;quot;,&amp;quot;caption&amp;quot;:&amp;quot;The start of my guardrails skill, where I’ve compiled all the particular ways that content I submit can fall below par. (All images courtesy of Katie Parrott.)&amp;quot;,&amp;quot;error&amp;quot;:null}"&gt;&lt;div&gt;&lt;a href="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4270/optimized_f33ae1cf-f442-4546-ba0c-fe99a49747c8.png" target="_blank" rel="noopener noreferrer"&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4270/optimized_f33ae1cf-f442-4546-ba0c-fe99a49747c8.png" alt="The start of my guardrails skill, where I’ve compiled all the particular ways that content I submit can fall below par. (All images courtesy of Katie Parrott.)"&gt;&lt;/a&gt;&lt;figcaption class="quill-image-caption"&gt;The start of my guardrails skill, where I’ve compiled all the particular ways that content I submit can fall below par. (All images courtesy of Katie Parrott.)&lt;/figcaption&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;&lt;br&gt;&lt;/p&gt;&lt;p&gt;Writing with AI tends to be portrayed as a bargain: The machine does more, so the human does less. 
But in my experience—a microcosm of Every CEO &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@danshipper" rel="noopener noreferrer" target="_blank"&gt;Dan Shipper&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;’s argument in &lt;u&gt;&lt;a href="https://every.to/p/after-automation" rel="noopener noreferrer" target="_blank"&gt;“After Automation”&lt;/a&gt;&lt;/u&gt;—it changes what the human does instead of reducing the workload. I have to be clear about defining my standards so a model can understand them. That creates more work—but it helps me &lt;u&gt;&lt;a href="https://every.to/working-overtime/i-taught-claude-every-s-standards-it-taught-me-mine" rel="noopener noreferrer" target="_blank"&gt;understand them better myself&lt;/a&gt;&lt;/u&gt;. &lt;/p&gt;&lt;p&gt;Setting up reviews like /guardrails takes time, attention, and a certain comfort with a tool like &lt;u&gt;&lt;a href="https://every.to/guides/codex-for-knowledge-work" rel="noopener noreferrer" target="_blank"&gt;Codex&lt;/a&gt;&lt;/u&gt; or Claude Code. But once the reviewers are in place and working, I can spend more of my time pushing the draft from good to great. My drafts are now much cleaner and my own preferences are less of a mystery to myself, because I’ve had to think and talk about them enough that they’ve worn new grooves into my brain. &lt;/p&gt;&lt;p&gt;I’m going to show you a few of the reviewers I rely on and what goes into them (I’ll share a set on Every’s GitHub along with this piece). But it should serve as an example, not a blueprint; the special sauce of this process comes from setting and enforcing your own collection of style requirements. &lt;/p&gt;&lt;h2&gt;Skills rule everything around me&lt;/h2&gt;&lt;p&gt;In the beginning of any good guardrail system, there are &lt;strong&gt;skills. &lt;/strong&gt;&lt;/p&gt;&lt;p&gt;At the mechanical level, a skill is a Markdown file with instructions inside it. Practically, it’s a way of packaging judgment. 
When I invoke the guardrails skill, I am asking the model to read a draft through a set of lenses: Look for AI tells, vague claims, hedges, limp openings, and all the little ways a zombie draft can pass as finished without a pulse.&lt;/p&gt;&lt;p&gt;I’ve become fanatical about naming conventions. After all, skill names have to be sticky enough that you remember them when you need them—although this gets less true with every model release, as AI becomes better at deciding which tools it needs to do the job. Still, “assess narrative momentum” sounds like a task someone puts in a project management tool shortly before everyone involved loses the will to live. Instead of clinical descriptors, I’ve given my more editorial skills their own personas: &lt;strong&gt;Sorkin&lt;/strong&gt; is a reviewer with a job. He wants to keep the piece walking and talking, not mired in unnecessary specifics. Similarly, &lt;strong&gt;Mom&lt;/strong&gt; wants to know where a reader who’s not as AI-pilled as I am might get lost. &lt;strong&gt;Asshole&lt;/strong&gt; wants to attack the weakest version of the argument, which is annoying because sometimes the weakest version of the argument is the one I wrote.&lt;/p&gt;&lt;p&gt;Each of these reviewers asks a different question. Together, they give me a way to pressure-test a draft before I hand it to a human editor whose attention I would prefer to see spent on problems only a human editor can solve. Our brains belong on the piece’s angle, claim, storytelling, and audience fit. You know, the fun stuff, with some stakes attached. 
&lt;/p&gt;&lt;h2&gt;Running the guardrail gauntlet&lt;/h2&gt;&lt;p&gt;Here’s an image to give you a sense of what a typical final review looks like before I hand a piece to an editor: &lt;/p&gt;&lt;div class="quill-block-image" id="quill-block-image-1780925110418-ukmdkts6x" data-source="{&amp;quot;dom_id&amp;quot;:&amp;quot;quill-block-image-1780925110418-ukmdkts6x&amp;quot;,&amp;quot;link&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4270/optimized_71237c30-fb2d-4ab6-9541-53b5daa1c778.png&amp;quot;,&amp;quot;image&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4270/optimized_71237c30-fb2d-4ab6-9541-53b5daa1c778.png&amp;quot;,&amp;quot;caption&amp;quot;:&amp;quot;A comprehensive rundown of 12 agents reviewing a draft ahead of submitting to an editor.&amp;quot;,&amp;quot;error&amp;quot;:null}"&gt;&lt;div&gt;&lt;a href="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4270/optimized_71237c30-fb2d-4ab6-9541-53b5daa1c778.png" target="_blank" rel="noopener noreferrer"&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4270/optimized_71237c30-fb2d-4ab6-9541-53b5daa1c778.png" alt="A comprehensive rundown of 12 agents reviewing a draft ahead of submitting to an editor."&gt;&lt;/a&gt;&lt;figcaption class="quill-image-caption"&gt;A comprehensive rundown of 12 agents reviewing a draft ahead of submitting to an editor.&lt;/figcaption&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;&lt;br&gt;&lt;/p&gt;&lt;p&gt;This preflight checklist comes after I’ve drafted, argued with, rewritten, and reread a piece many times. By this stage, the writing is mine, whether AI was involved in the drafting or not. AI’s job is to look for the things I am most likely to miss because I am tired, emotionally attached, or too deep inside the logic of the draft to see the true shape of it anymore.&lt;/p&gt;&lt;p&gt;This final lineup of reviews may look excessive—and it is. 
When a model can help you get to a plausible draft quickly, the danger is writing that sounds profound at first glance but is, in fact, just AI being pleased with itself. I designed my reviewers to be the closest inspection possible.&lt;/p&gt;&lt;h2&gt;Different reviewers for different stages&lt;/h2&gt;&lt;p&gt;With my guardrails in place, the real fun is channeling all my anxious energy into invoking them through every stage of the writing process.&lt;/p&gt;&lt;p&gt;At the outline stage, I want pressure on the argument. While drafting, I want line-level reviews that catch my tics. Once the full draft exists, I want readers arguing with the piece as a whole. And before I hand it to an editor, I want the checklist to sweep for everything my tired brain missed.&lt;/p&gt;&lt;div class="quill-block-image" id="quill-block-image-1780925110421-i33089ug5" data-source="{&amp;quot;dom_id&amp;quot;:&amp;quot;quill-block-image-1780925110421-i33089ug5&amp;quot;,&amp;quot;link&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4270/optimized_1d6fed1d-c904-4021-a7f7-fe37e560aa60.png&amp;quot;,&amp;quot;image&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4270/optimized_1d6fed1d-c904-4021-a7f7-fe37e560aa60.png&amp;quot;,&amp;quot;caption&amp;quot;:&amp;quot;The beginning of the guardrails readout for this very draft.&amp;quot;,&amp;quot;error&amp;quot;:null}"&gt;&lt;div&gt;&lt;a href="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4270/optimized_1d6fed1d-c904-4021-a7f7-fe37e560aa60.png" target="_blank" rel="noopener noreferrer"&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4270/optimized_1d6fed1d-c904-4021-a7f7-fe37e560aa60.png" alt="The beginning of the guardrails readout for this very draft."&gt;&lt;/a&gt;&lt;figcaption class="quill-image-caption"&gt;The beginning of the guardrails readout for this very draft.&lt;/figcaption&gt;&lt;/div&gt;&lt;/div&gt;&lt;h2&gt;Outline: Beat up 
the argument&lt;/h2&gt;&lt;p&gt;The first guardrail round happens at the outline stage, before I have prose to get precious about. I send the structure to reviewers whose job is to hunt for weak spots. &lt;strong&gt;Hitchcock&lt;/strong&gt; looks for tension. &lt;strong&gt;Sedaris&lt;/strong&gt; looks for a sense of humor. And when I’m really a glutton for punishment, there’s /asshole, which is set up to deliver the least-charitable interpretation of the argument possible so I can shore it up.&lt;/p&gt;&lt;div class="quill-block-image" id="quill-block-image-1780925110423-7y6stha18" data-source="{&amp;quot;dom_id&amp;quot;:&amp;quot;quill-block-image-1780925110423-7y6stha18&amp;quot;,&amp;quot;link&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4270/optimized_8f31ad5f-2215-4173-9d87-2e1c295e1c2d.png&amp;quot;,&amp;quot;image&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4270/optimized_8f31ad5f-2215-4173-9d87-2e1c295e1c2d.png&amp;quot;,&amp;quot;caption&amp;quot;:&amp;quot;The Hitchcock skill reviews the draft for suspense. Is there a “bomb under the table” that will make the reader want to keep reading?&amp;quot;,&amp;quot;error&amp;quot;:null}"&gt;&lt;div&gt;&lt;a href="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4270/optimized_8f31ad5f-2215-4173-9d87-2e1c295e1c2d.png" target="_blank" rel="noopener noreferrer"&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4270/optimized_8f31ad5f-2215-4173-9d87-2e1c295e1c2d.png" alt="The Hitchcock skill reviews the draft for suspense. Is there a “bomb under the table” that will make the reader want to keep reading?"&gt;&lt;/a&gt;&lt;figcaption class="quill-image-caption"&gt;The Hitchcock skill reviews the draft for suspense. 
Is there a “bomb under the table” that will make the reader want to keep reading?&lt;/figcaption&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;&lt;br&gt;&lt;/p&gt;&lt;p&gt;At this stage, I want structural trouble in bright red, whether it’s a claim the reader can’t follow, a section that arrives too early, or a setup with no payoff. The outline is where I can still rearrange and reframe big chunks of the work to make the strongest argument.&lt;/p&gt;&lt;h2&gt;Section drafts: Catch weak prose early&lt;/h2&gt;&lt;p&gt;By the time I’m ready to draft, I’m in full chaos mode, so I lean on the skills that help me clean things up. I write section by section, and after each section I run /ai-check and /guardrails before moving on. I am trying to catch that tell-tale smoothing while it is still local. &lt;/p&gt;&lt;p&gt;I also run senior editor &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@jackcheng" rel="noopener noreferrer" target="_blank"&gt;Jack Cheng&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;’s /tighten-draft skill. Jack built the skill around his years of editorial instincts to spot the kinds of bloat that can accumulate in a draft and make it a chore for a reader to get through. That is one of the pleasures of skills: Editorial &lt;u&gt;&lt;a href="https://every.to/p/what-is-taste-really" rel="noopener noreferrer" target="_blank"&gt;taste&lt;/a&gt;&lt;/u&gt; becomes something you can share. My workflow can carry my standards, Kate’s feedback, Jack’s editing rules, Every’s house style, and whatever strange little reviewer I decided to invent at 11 p.m. because a draft was annoying me in a new way.&lt;/p&gt;&lt;h2&gt;Full draft: Make the piece argue back&lt;/h2&gt;&lt;p&gt;Once I have a full draft, the focus of the review moves from the details back to the big picture. I run a developmental review for argument, structure, stakes, and payoff. Then I run a column-specific review: /working-overtime for this piece, or whichever column skill fits the draft. 
That pass checks to see if the essay makes all of my signature structural moves as outlined in my style guide: Start from lived friction, show the messy middle, connect the personal to the larger AI-era work question, and hand the reader something usable.&lt;/p&gt;&lt;p&gt;This is also where I call on a more sophisticated form of skill-building: orchestration, or the threading together of multiple skills and agents to execute a more complex task. I’ve created a command called /panel, which convenes a set of my reviewers for a more holistic review. First, the panel reads the draft’s context: piece type, audience, stage, and goals, all of which it knows from the interview stage and my Working Overtime style guide. Then it proposes the reviewers suited to the problem in front of it—or lets me pick. For this draft, I had it run Mom for accessibility, Hitchcock for tension, Sorkin for momentum, and Sedaris for humor. This is different from summoning the agents one by one, because the output is a synthesis of all their feedback rather than a check for one specific thing.
&lt;/p&gt;&lt;div class="quill-block-image" id="quill-block-image-1780925110423-x6t2jj59m" data-source="{&amp;quot;dom_id&amp;quot;:&amp;quot;quill-block-image-1780925110423-x6t2jj59m&amp;quot;,&amp;quot;link&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4270/optimized_6cff6cc0-c898-4361-a6b0-75f2722e8752.png&amp;quot;,&amp;quot;image&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4270/optimized_6cff6cc0-c898-4361-a6b0-75f2722e8752.png&amp;quot;,&amp;quot;caption&amp;quot;:&amp;quot;/panel summons a lineup of reviewers to render their verdict on a draft trending toward submission.&amp;quot;,&amp;quot;error&amp;quot;:null}"&gt;&lt;div&gt;&lt;a href="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4270/optimized_6cff6cc0-c898-4361-a6b0-75f2722e8752.png" target="_blank" rel="noopener noreferrer"&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4270/optimized_6cff6cc0-c898-4361-a6b0-75f2722e8752.png" alt="/panel summons a lineup of reviewers to render their verdict on a draft trending toward submission."&gt;&lt;/a&gt;&lt;figcaption class="quill-image-caption"&gt;/panel summons a lineup of reviewers to render their verdict on a draft trending toward submission.&lt;/figcaption&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;&lt;br&gt;&lt;/p&gt;&lt;p&gt;After the reviewers run in parallel, a synthesizer reads their feedback together. It looks for consensus findings, productive tensions, unique insights, recommended priorities, and the single hard question the draft keeps circling. One reviewer may want a section cut. Another may think the same section is the most alive thing in the draft. The synthesis keeps that disagreement intact, because the tension tells me what decision the piece is asking me to make. 
Then I, the human, have to figure out how to make it.&lt;/p&gt;&lt;p&gt;Here’s a final big-picture question the panel on this draft left me with: &lt;/p&gt;&lt;p&gt;&lt;em&gt;Is this an essay about a system that’s working, or an essay about a writer who has to keep building infrastructure because her own defaults keep losing? The reviewers can tell you the piece currently splits the difference—selling the system in the middle, hinting at the slippage in the corners. The opening, middle, and ending all change depending on the answer. Hitchcock said it cleanest: The melodrama only works if it’s true. So, is it?&lt;/em&gt;&lt;/p&gt;&lt;h2&gt;Final pass: Run the team checklist &lt;/h2&gt;&lt;p&gt;Finally, I run the draft through the team’s accumulated editorial muscle memory, as it’s been packaged in our shared skills. I rerun /ai-check, because revisions can reintroduce assistant voice. I rerun /guardrails, and more often than not, it finds tics I’ve either added or been too enamored with to kill earlier in the process.   &lt;/p&gt;&lt;p&gt;Then I run /tighten-draft again, plus /kate-top-edit, a skill built around Kate’s review expectations: vague “this” openers, unsourced data, floating quotes, unidentified people, unexplained jargon, missing Every links, hedges, marketing language, sentence fragments, and the AI tells our editorial team has learned to avoid at all costs. &lt;/p&gt;&lt;h2&gt;Human review is the ultimate fine tuning&lt;/h2&gt;&lt;p&gt;By this point in the workflow, the whole system can look like protection from AI. The fuller truth is that a lot of it is protection from my own blind spots when I’m too tired to live up to my best self. Writing with AI requires, in the words of &lt;em&gt;Harry Potter’s&lt;/em&gt; Mad-Eye Moody, “constant vigilance.” But then, writing without AI does, too. &lt;/p&gt;&lt;p&gt;Some reviewers catch model habits, like overly clean transitions and sentences that use symmetry as a substitute for thought. 
Others catch Katie habits: phrases I reach for too often, clever constructions I enjoy more than the reader does, and little rhetorical moves that feel sharp in the moment and embarrassing by the next morning.&lt;/p&gt;&lt;p&gt;The process is ongoing and imperfect. We are fallible creatures, as are our machines, so there will always be new quirks to banish. It’s important to be in the habit of updating old skills with new items to watch for, and creating new skills to enforce standards that you discover along the way.&lt;/p&gt;&lt;p&gt;That is why I am pairing this piece with a repo. I want you to be able to pull down the skills, open them up, and make them useful for your own AI writing adventures—and foibles.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Here’s how to get started: &lt;/strong&gt;&lt;/p&gt;&lt;ol&gt;&lt;li&gt;Open &lt;u&gt;&lt;a href="https://github.com/EveryInc/draft-review-kit" rel="noopener noreferrer" target="_blank"&gt;the GitHub repository&lt;/a&gt;&lt;/u&gt;. Click “Code,” then “Download ZIP,” and unzip the folder.&lt;/li&gt;&lt;li&gt;Open the folder in the Codex or Claude app.&lt;/li&gt;&lt;li&gt;Tell your agent: “Read the README and help me install these reviewer skills. Explain each step in plain language.”&lt;/li&gt;&lt;li&gt;Choose one reviewer and run it on a draft you know well.&lt;/li&gt;&lt;li&gt;Note what it caught, what it got wrong, and what it missed. Ask your agent to update the reviewer’s SKILL.md with that feedback.&lt;/li&gt;&lt;li&gt;Run it again and repeat until the feedback is useful.&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;You do not need to fork the repository to get started. Forking creates your own copy on GitHub, which is useful if you want to track your changes or receive future updates.&lt;/p&gt;&lt;p&gt;Grab the repo. Give the reviewers meaningful names. Delete the ones that are pure Katie pathology. Add new ones to prevent your own bad habits, whether that’s legal overreach, unsupported claims, faux profundity, jargon, or hype. 
Teach the machine to protect the work from default settings—the machine’s, but more importantly, yours. &lt;/p&gt;&lt;div class="quill-button" data-source="{&amp;quot;id&amp;quot;:&amp;quot;quill-button-1780925261814&amp;quot;,&amp;quot;text&amp;quot;:&amp;quot;Download Katie's draft review kit on GitHub&amp;quot;,&amp;quot;url&amp;quot;:&amp;quot;https://github.com/EveryInc/draft-review-kit?source=post_button&amp;quot;}" id="quill-button-1780925261814"&gt;&lt;a href="https://github.com/EveryInc/draft-review-kit?source=post_button"&gt;Download Katie's draft review kit on GitHub&lt;/a&gt;&lt;/div&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;em&gt;&lt;a href="https://every.to/@katie.parrott12" rel="noopener noreferrer" target="_blank"&gt;Katie Parrott&lt;/a&gt;&lt;/em&gt;&lt;/strong&gt; &lt;em&gt;is a staff writer at Every. You can read more of her work in&lt;/em&gt; &lt;em&gt;&lt;a href="https://katieparrott.substack.com/" rel="noopener noreferrer" target="_blank"&gt;her newsletter&lt;/a&gt;. 
To read more essays like this, subscribe to &lt;u&gt;&lt;a href="https://every.to/subscribe" rel="noopener noreferrer" target="_blank"&gt;Every&lt;/a&gt;&lt;/u&gt;, and follow us on X at &lt;u&gt;&lt;a href="http://twitter.com/every" rel="noopener noreferrer" target="_blank"&gt;@every&lt;/a&gt;&lt;/u&gt; and on &lt;u&gt;&lt;a href="https://www.linkedin.com/company/everyinc/" rel="noopener noreferrer" target="_blank"&gt;LinkedIn&lt;/a&gt;&lt;/u&gt;.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;Discover Every’s &lt;u&gt;&lt;a href="https://every.to/events" rel="noopener noreferrer" target="_blank"&gt;upcoming workshops and camps&lt;/a&gt;&lt;/u&gt;, and access recordings from past events.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;For sponsorship opportunities, reach out to sponsorships@every.to.&lt;/em&gt;&lt;/p&gt;&lt;div class="quill-button" data-source="{&amp;quot;id&amp;quot;:&amp;quot;quill-button-1769187301610&amp;quot;,&amp;quot;url&amp;quot;:&amp;quot;https://every.to/subscribe?source=post_button&amp;quot;,&amp;quot;text&amp;quot;:&amp;quot;Subscribe&amp;quot;}" id="quill-button-1769187301610"&gt;&lt;a href="https://every.to/subscribe?source=post_button"&gt;Subscribe&lt;/a&gt;&lt;/div&gt;</description>
      <author>Katie Parrott / Working Overtime</author>
      <pubDate>2026-06-08 18:00:00 -0400</pubDate>
      <guid>https://every.to/working-overtime/my-editor-caught-me-sounding-like-ai-now-ai-catches-me-first</guid>
      <link>https://every.to/working-overtime/my-editor-caught-me-sounding-like-ai-now-ai-catches-me-first</link>
    </item>
    <item>
      <title>AI Is Ready. Organizations Aren’t.</title>
      <description>&lt;table&gt;&lt;tr&gt;&lt;td&gt;&lt;/td&gt;&lt;td&gt;&lt;table&gt;&lt;tr&gt;&lt;td&gt;by &lt;a href="https://every.to/@Every%20Staff" itemprop="name"&gt;Every Staff&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;figure&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/post/cover/4291/full_page_cover_7f71563100ad04dc-CW-02.png"&gt;&lt;figcaption&gt;Midjourney/Every illustration.&lt;/figcaption&gt;&lt;/figure&gt;&lt;p&gt;Hello, and happy Sunday! This week the consulting team published two practical guides. &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@mike_2114" rel="noopener noreferrer" target="_blank"&gt;Mike Taylor&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; built on engineer &lt;strong&gt;Steve Yegge&lt;/strong&gt;’s viral post to map the &lt;u&gt;&lt;a href="https://every.to/guides/the-eight-levels-of-ai-adoption" rel="noopener noreferrer" target="_blank"&gt;eight levels of AI adoption&lt;/a&gt;&lt;/u&gt;—with sample prompts and signals for when to move up—and &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@natalia_2944" rel="noopener noreferrer" target="_blank"&gt;Natalia Quintero&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; (who’s &lt;u&gt;&lt;a href="https://every.to/p/i-talked-to-more-than-100-companies-about-ai-here-s-what-s-actually-working" rel="noopener noreferrer" target="_blank"&gt;talked to leadership teams&lt;/a&gt;&lt;/u&gt; at hundreds of organizations) laid out a foolproof &lt;u&gt;&lt;a href="https://every.to/guides/an-executive-s-guide-to-implementing-ai" rel="noopener noreferrer" target="_blank"&gt;five-step process&lt;/a&gt;&lt;/u&gt; for executives rolling out AI across their companies. 
Covering &lt;u&gt;&lt;a href="https://every.to/also-true-for-humans/how-microsoft-is-building-for-a-world-of-metered-intelligence" rel="noopener noreferrer" target="_blank"&gt;Microsoft Build&lt;/a&gt;&lt;/u&gt;, Mike argued that &lt;u&gt;&lt;a href="https://every.to/context-window/why-we-ll-still-be-employed-when-ai-can-do-everything" rel="noopener noreferrer" target="_blank"&gt;enterprise adoption&lt;/a&gt;&lt;/u&gt; lags the news cycle—a gap he sees up close with the enterprise clients he advises. He also made a counterargument to &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@danshipper" rel="noopener noreferrer" target="_blank"&gt;Dan Shipper&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;’s essay about the future of work, &lt;u&gt;&lt;a href="https://every.to/p/after-automation" rel="noopener noreferrer" target="_blank"&gt;“After Automation.”&lt;/a&gt;&lt;/u&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/on-every/spiral-4-0-goes-agent-native" rel="noopener noreferrer" target="_blank"&gt;Spiral&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;&lt;u&gt;&lt;a href="https://every.to/on-every/spiral-4-0-goes-agent-native" rel="noopener noreferrer" target="_blank"&gt; 4.0&lt;/a&gt;&lt;/u&gt; shipped this week: Every’s writing tool can now draft in your voice from inside any agent, with a price cut to match. 
Elsewhere, Figma’s &lt;strong&gt;Matt Colyer&lt;/strong&gt; makes the case that the &lt;u&gt;&lt;a href="https://every.to/podcast/figma-exec-on-why-the-saaspocalypse-is-a-goldmine" rel="noopener noreferrer" target="_blank"&gt;SaaSpocalypse&lt;/a&gt;&lt;/u&gt; is overblown, designer &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@daniel_5fbd21_1" rel="noopener noreferrer" target="_blank"&gt;Daniel Rodrigues&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; shares a two-tool workflow for &lt;u&gt;&lt;a href="https://every.to/context-window/opus-4-8-is-smart-enough-to-get-in-your-way" rel="noopener noreferrer" target="_blank"&gt;image generators&lt;/a&gt;&lt;/u&gt;, &lt;strong&gt;&lt;u&gt;&lt;a href="https://www.monologue.to/" rel="noopener noreferrer" target="_blank"&gt;Monologue&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; general manager &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@naveen_6804" rel="noopener noreferrer" target="_blank"&gt;Naveen Naidu&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; has a system for making &lt;u&gt;&lt;a href="https://every.to/context-window/why-we-ll-still-be-employed-when-ai-can-do-everything" rel="noopener noreferrer" target="_blank"&gt;coding agents&lt;/a&gt;&lt;/u&gt; more efficient with custom local skills, and the team names its most &lt;u&gt;&lt;a href="https://every.to/context-window/why-we-ll-still-be-employed-when-ai-can-do-everything" rel="noopener noreferrer" target="_blank"&gt;annoying model output&lt;/a&gt;&lt;/u&gt;.&lt;em&gt;—&lt;u&gt;&lt;a href="https://every.to/@kate_1767" rel="noopener noreferrer" target="_blank"&gt;Kate Lee&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;Was this newsletter forwarded to you? 
&lt;u&gt;&lt;a href="https://every.to/account" rel="noopener noreferrer" target="_blank"&gt;Sign up&lt;/a&gt;&lt;/u&gt; to get it in your inbox&lt;/em&gt;.&lt;/p&gt;&lt;h2&gt;&lt;hr class="quill-line"&gt;&lt;/h2&gt;&lt;h2&gt;Knowledge base&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/guides/the-eight-levels-of-ai-adoption" rel="noopener noreferrer" target="_blank"&gt;“The Eight Levels of AI Adoption”&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;&lt;em&gt; by &lt;u&gt;&lt;a href="https://every.to/@mike_2114" rel="noopener noreferrer" target="_blank"&gt;Mike Taylor&lt;/a&gt;&lt;/u&gt; and &lt;u&gt;&lt;a href="https://every.to/@laura_27bbaf_1" rel="noopener noreferrer" target="_blank"&gt;Laura Entis&lt;/a&gt;&lt;/u&gt;/&lt;u&gt;&lt;a href="https://every.to/guides" rel="noopener noreferrer" target="_blank"&gt;Guides&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;: A framework mapping every stage of AI adoption, from Level 1 (a chatbot you ask and it answers) to Level 8 (an orchestrator agent that runs a team of sub-agents), with example prompts and guidance on when to move up. A higher level isn’t automatically better—the right level for a task depends on how much you trust the AI to run without intervention and how costly a mistake would be. A &lt;u&gt;&lt;a href="https://every.to/p/where-do-you-fall-on-the-eight-levels-of-ai-adoption" rel="noopener noreferrer" target="_blank"&gt;companion essay&lt;/a&gt;&lt;/u&gt; lets you figure out which level you’re on. 
Read this for where you stand today and what it takes to move up a level.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/guides/an-executive-s-guide-to-implementing-ai" rel="noopener noreferrer" target="_blank"&gt;“An Executive’s Guide to Implementing AI”&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;&lt;em&gt; by &lt;u&gt;&lt;a href="https://every.to/@natalia_2944" rel="noopener noreferrer" target="_blank"&gt;Natalia Quintero&lt;/a&gt;&lt;/u&gt;/&lt;u&gt;&lt;a href="https://every.to/guides" rel="noopener noreferrer" target="_blank"&gt;Guides&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;: AI adoption isn’t being held back by the models—it’s the organization. &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@natalia_2944" rel="noopener noreferrer" target="_blank"&gt;Natalia Quintero&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;, head of Every Consulting, gives executives who’ve bought the tools but aren’t seeing returns a five-step framework, laid out as a 60-day plan—with a &lt;u&gt;&lt;a href="https://every.to/p/company-wide-ai-implementation-in-five-steps" rel="noopener noreferrer" target="_blank"&gt;companion essay&lt;/a&gt;&lt;/u&gt; that previews it. 
Read this for the five steps and how to run them.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/also-true-for-humans/how-microsoft-is-building-for-a-world-of-metered-intelligence" rel="noopener noreferrer" target="_blank"&gt;“How Microsoft Is Building for a World of Metered Intelligence”&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;&lt;em&gt; by &lt;u&gt;&lt;a href="https://every.to/@mike_2114" rel="noopener noreferrer" target="_blank"&gt;Mike Taylor&lt;/a&gt;&lt;/u&gt;/&lt;u&gt;&lt;a href="https://every.to/also-true-for-humans" rel="noopener noreferrer" target="_blank"&gt;Also True for Humans&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;: Reporting from Microsoft Build, &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@mike_2114" rel="noopener noreferrer" target="_blank"&gt;Mike Taylor&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; argues that Microsoft is the first big company to design for a world where intelligence is metered and the era of subsidized AI subscriptions is ending. Its response includes automatic model routing, a laptop that runs AI locally, and cheaper, smaller models. 
Read this for a ground-level look at AI’s post-subsidy era.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/context-window/why-we-ll-still-be-employed-when-ai-can-do-everything" rel="noopener noreferrer" target="_blank"&gt;“Why We’ll Still Be Employed When AI Can Do Everything”&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;&lt;em&gt; by &lt;u&gt;&lt;a href="https://every.to/@laura_27bbaf_1" rel="noopener noreferrer" target="_blank"&gt;Laura Entis&lt;/a&gt;&lt;/u&gt;/&lt;u&gt;&lt;a href="https://every.to/context-window" rel="noopener noreferrer" target="_blank"&gt;Context Window&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;: In a counterpoint to Dan’s &lt;u&gt;&lt;a href="https://every.to/p/after-automation" rel="noopener noreferrer" target="_blank"&gt;“After Automation”&lt;/a&gt;&lt;/u&gt; essay, Mike argues that even after AI can outwork people at well-run companies, running it will cost so much energy and compute that hiring a person is often cheaper. Read this for a grounded take on the AI-employment debate.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/context-window/opus-4-8-is-smart-enough-to-get-in-your-way" rel="noopener noreferrer" target="_blank"&gt;“Opus 4.8 Is Smart Enough to Get in Your Way”&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;&lt;em&gt; by &lt;u&gt;&lt;a href="https://every.to/@laura_27bbaf_1" rel="noopener noreferrer" target="_blank"&gt;Laura Entis&lt;/a&gt;&lt;/u&gt;/&lt;u&gt;&lt;a href="https://every.to/context-window" rel="noopener noreferrer" target="_blank"&gt;Context Window&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;: A week after our &lt;u&gt;&lt;a href="https://every.to/vibe-check/opus-4-8-vibecheck" rel="noopener noreferrer" target="_blank"&gt;Opus 4.8 Vibe Check&lt;/a&gt;&lt;/u&gt;, we check back in—now that the public has reacted and more of the Every team is using it daily—and our initial read holds: It’s strong on dense, long-running work but quick to get in its own way. 
Read this for how the verdict looks a week on.&lt;/p&gt;&lt;p&gt;🖥 &lt;strong&gt;&lt;u&gt;&lt;a href="https://www.youtube.com/watch?v=2Yy8EJlZJFE" rel="noopener noreferrer" target="_blank"&gt;“Codex Runs My Inbox Now”&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;&lt;em&gt; by &lt;u&gt;&lt;a href="https://every.to/@danshipper" rel="noopener noreferrer" target="_blank"&gt;Dan Shipper&lt;/a&gt;&lt;/u&gt;/&lt;u&gt;&lt;a href="https://www.youtube.com/@EveryInc" rel="noopener noreferrer" target="_blank"&gt;Every&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;: Dan shows the workflow that’s kept him at inbox zero for 13 weeks straight—a Codex-native app that pulls his emails, Slack messages, meetings, and company context into review cards, drafts the next action on each, and learns from every decision. It shows Codex working as an operating system for knowledge work and ends with the full prompt to build the app yourself. Read this for the inbox-sweep workflow and the prompt to copy.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/podcast/figma-exec-on-why-the-saaspocalypse-is-a-goldmine" rel="noopener noreferrer" target="_blank"&gt;“Figma Exec on Why the SaaSpocalypse Is a Goldmine”&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;&lt;em&gt; by &lt;u&gt;&lt;a href="https://every.to/@danshipper" rel="noopener noreferrer" target="_blank"&gt;Dan Shipper&lt;/a&gt;&lt;/u&gt;/&lt;u&gt;&lt;a href="https://every.to/podcast" rel="noopener noreferrer" target="_blank"&gt;AI &amp;amp; I&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;: &lt;strong&gt;Matt Colyer&lt;/strong&gt;, Figma’s director of product management for developers, argues that the “SaaSpocalypse”—the fear that vibe coding will kill software by letting anyone build their own tools—has the economics backward: AI expanded the developer base, so more software gets built and software becomes more valuable, not less. Watch or listen to this for the clearest reframe of the vibe-coding-kills-SaaS panic. 
🎧 🖥 Listen on &lt;u&gt;&lt;a href="https://open.spotify.com/episode/4qTiIlvhxgnGI0cG06aFw5" rel="noopener noreferrer" target="_blank"&gt;Spotify&lt;/a&gt;&lt;/u&gt; or &lt;u&gt;&lt;a href="https://podcasts.apple.com/us/podcast/ai-i/id1719789201" rel="noopener noreferrer" target="_blank"&gt;Apple Podcasts&lt;/a&gt;&lt;/u&gt;, watch on &lt;u&gt;&lt;a href="https://www.youtube.com/watch?v=kYKebKB3-d0" rel="noopener noreferrer" target="_blank"&gt;YouTube&lt;/a&gt;&lt;/u&gt;, or follow the discussion &lt;u&gt;&lt;a href="https://x.com/danshipper/status/2062202908306030915" rel="noopener noreferrer" target="_blank"&gt;on X&lt;/a&gt;&lt;/u&gt;.&lt;/p&gt;&lt;p&gt;&lt;br&gt;&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;Log on&lt;/h2&gt;&lt;p&gt;Get hands-on with how Every uses AI. These are the &lt;u&gt;&lt;a href="https://every.to/events" rel="noopener noreferrer" target="_blank"&gt;live camps, workshops, and meetups&lt;/a&gt;&lt;/u&gt; where team members teach the workflows behind our work.&lt;/p&gt;&lt;h5&gt;Upcoming camp&lt;/h5&gt;&lt;ul&gt;&lt;li&gt;&lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/events/codex-camp-our-power-user-guide" rel="noopener noreferrer" target="_blank"&gt;Codex Camp: Our Power User Guide&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;: On June 12, Dan and the Every team host a two-hour live walkthrough of the &lt;u&gt;&lt;a href="https://every.to/guides/codex-for-knowledge-work" rel="noopener noreferrer" target="_blank"&gt;Codex power-user guide&lt;/a&gt;&lt;/u&gt;—setup, workflows, and Codex-native app development. 
&lt;u&gt;&lt;a href="https://every.to/events/codex-camp-our-power-user-guide" rel="noopener noreferrer" target="_blank"&gt;RSVP&lt;/a&gt;&lt;/u&gt;.&lt;/li&gt;&lt;/ul&gt;&lt;h5&gt;Recordings you may have missed&lt;/h5&gt;&lt;ul&gt;&lt;li&gt;&lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/events/compound-engineering-camp-3" rel="noopener noreferrer" target="_blank"&gt;Compound Engineering Camp&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;: &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@kieran_1355" rel="noopener noreferrer" target="_blank"&gt;Kieran Klaassen&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; and contributor &lt;strong&gt;Trevin Chow&lt;/strong&gt; walk through compound engineering, Every’s AI-native development workflow. &lt;u&gt;&lt;a href="https://www.youtube.com/watch?v=xN7LA-2l5wY" rel="noopener noreferrer" target="_blank"&gt;Watch the recording&lt;/a&gt;&lt;/u&gt;.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/events/executive-AI-sessions" rel="noopener noreferrer" target="_blank"&gt;Executive AI Sessions&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;: Natalia introduces Every Consulting’s new offering for leadership teams navigating AI adoption. 
&lt;u&gt;&lt;a href="https://www.youtube.com/watch?v=xvnP38BqmkI" rel="noopener noreferrer" target="_blank"&gt;Watch the recording&lt;/a&gt;&lt;/u&gt;.&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;From Every Studio&lt;/h2&gt;&lt;h5&gt;Spiral 4.0 ships agent-native access and a price cut&lt;/h5&gt;&lt;p&gt;&lt;strong&gt;&lt;u&gt;&lt;a href="https://writewithspiral.com" rel="noopener noreferrer" target="_blank"&gt;Spiral&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;, Every’s AI writing tool, &lt;u&gt;&lt;a href="https://every.to/on-every/spiral-4-0-goes-agent-native" rel="noopener noreferrer" target="_blank"&gt;shipped version 4.0&lt;/a&gt;&lt;/u&gt; this week: a new style engine, agent-native access through MCP, CLI, and API, and expanded team workspaces for writing in a shared voice. Pricing moves from sessions to tokens, dropping the personal plan to $15 a month (from $25) and team plans to $25 per user (from $35).&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;Alignment&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;The great wall gets greater. &lt;/strong&gt;Drug companies do not create billion-dollar assets by having ideas. They need human trials to prove the idea works, and China has become exceptionally good at running them. Because the government has made biotechnology innovation a strategic priority, policymakers have cleared the bureaucratic and regulatory obstacles that have long slowed drug development in the U.S. and Europe. 
As a result, &lt;a href="https://www.bloomberg.com/news/features/2025-07-13/china-drugmakers-catching-up-to-us-big-pharma-with-new-medicine-innovation" rel="noopener noreferrer" target="_blank"&gt;last year &lt;/a&gt;&lt;u&gt;&lt;a href="https://www.bloomberg.com/news/features/2025-07-13/china-drugmakers-catching-up-to-us-big-pharma-with-new-medicine-innovation" rel="noopener noreferrer" target="_blank"&gt;Bloomberg&lt;/a&gt;&lt;/u&gt; reported that China had more than 1,250 novel drugs entering development, close to the U.S. count of about 1,440. A decade ago, Chinese biotech was synonymous with copycat drugs—which makes this a Sputnik moment for the industry. &lt;/p&gt;&lt;p&gt;One key reason for China’s ascendancy is that hundreds of millions of patients are concentrated in large urban hospitals, so companies can recruit quickly from a smaller number of high-volume sites. Chinese biotech firms can &lt;u&gt;&lt;a href="https://www.biospace.com/drug-development/clinical-trials-are-increasingly-going-global-with-china-a-main-beneficiary" rel="noopener noreferrer" target="_blank"&gt;reportedly&lt;/a&gt;&lt;/u&gt; complete patient enrollment for a phase 1 or phase 2 trial in nearly half the time a U.S. firm needs. In North America—and even more so in Europe—patients are scattered across fragmented health systems, where every trial site has its own contracts and ethics approvals, each one slow and cumbersome. 
&lt;/p&gt;&lt;div class="quill-block-image" id="quill-block-image-1780745497895" data-source="{&amp;quot;dom_id&amp;quot;:&amp;quot;quill-block-image-1780745497895&amp;quot;,&amp;quot;link&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4291/optimized_79fd8a23-cb57-4cf1-ad3f-d43a90ef1cfb.png&amp;quot;,&amp;quot;image&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4291/optimized_79fd8a23-cb57-4cf1-ad3f-d43a90ef1cfb.png&amp;quot;,&amp;quot;caption&amp;quot;:&amp;quot;China recruits trial patients in about half the time the U.S. takes and runs far more trials, across anti-cancer and anti-obesity drugs by phase, 2020–2024. (Source: Norstella.)&amp;quot;,&amp;quot;error&amp;quot;:null}"&gt;&lt;div&gt;&lt;a href="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4291/optimized_79fd8a23-cb57-4cf1-ad3f-d43a90ef1cfb.png" target="_blank" rel="noopener noreferrer"&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4291/optimized_79fd8a23-cb57-4cf1-ad3f-d43a90ef1cfb.png" alt="China recruits trial patients in about half the time the U.S. takes and runs far more trials, across anti-cancer and anti-obesity drugs by phase, 2020–2024. (Source: Norstella.)"&gt;&lt;/a&gt;&lt;figcaption class="quill-image-caption"&gt;China recruits trial patients in about half the time the U.S. takes and runs far more trials, across anti-cancer and anti-obesity drugs by phase, 2020–2024. (Source: Norstella.)&lt;/figcaption&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;&lt;br&gt;&lt;/p&gt;&lt;p&gt;China’s advantage is that it can turn a large population into clinical data much faster—and iterate on feedback loops of drug development to produce assets that are more effective, and thus more valuable to investors and big pharma. 
Recently, Legend, a Chinese biotech, developed its &lt;u&gt;&lt;a href="https://worksinprogress.co/issue/the-blood-cancer-that-became-solvable/" rel="noopener noreferrer" target="_blank"&gt;own version&lt;/a&gt;&lt;/u&gt; of a largely American drug innovation that treats multiple myeloma, a type of aggressive blood cancer. Chinese drug developers moved quickly into human trials and produced data strong enough for Johnson &amp;amp; Johnson to sign a global licensing and codevelopment deal with Legend worth $350 million upfront. &lt;/p&gt;&lt;p&gt;A second- and third-order consequence of this new landscape is that even U.S. biotech firms may start asking whether it makes sense to take their drugs to China first, at least to complete phase 1 and phase 2 trials. &lt;/p&gt;&lt;p&gt;For now, any drug seeking approval in the U.S. still needs evidence that regulators believe applies to American patients—and in many cases that means later-stage global or Western trials. But it may be inevitable that a Chinese biotech develops a China-originated drug that is licensed into the West without such a step. &lt;/p&gt;&lt;p&gt;If this change happens—and many believe it will—the next major obesity, oncology, or immunology drug may come from China, marking the same pattern of ascendancy already visible in solar, batteries, and electric vehicles. Biotech may be next.—&lt;em&gt;&lt;u&gt;&lt;a href="https://x.com/Ashwinreads" rel="noopener noreferrer" target="_blank"&gt;Ashwin Sharma&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;That’s all for this week! 
Be sure to follow Every on X at &lt;u&gt;&lt;a href="https://twitter.com/every" rel="noopener noreferrer" target="_blank"&gt;@every&lt;/a&gt;&lt;/u&gt; and on &lt;u&gt;&lt;a href="https://www.linkedin.com/company/everyinc/" rel="noopener noreferrer" target="_blank"&gt;LinkedIn&lt;/a&gt;&lt;/u&gt;.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;We &lt;u&gt;&lt;a href="https://every.to/studio" rel="noopener noreferrer" target="_blank"&gt;build AI tools&lt;/a&gt;&lt;/u&gt; for readers like you. Write brilliantly with &lt;u&gt;&lt;a href="https://writewithspiral.com/" rel="noopener noreferrer" target="_blank"&gt;Spiral&lt;/a&gt;&lt;/u&gt;. Organize files automatically with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://makeitsparkle.co/?utm_source=everyfooter" rel="noopener noreferrer" target="_blank"&gt;Sparkle&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. Deliver yourself from email with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://cora.computer" rel="noopener noreferrer" target="_blank"&gt;Cora&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. Dictate effortlessly with &lt;u&gt;&lt;a href="https://monologue.to" rel="noopener noreferrer" target="_blank"&gt;Monologue&lt;/a&gt;&lt;/u&gt;. 
Work on documents with AI agents using &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://www.proofeditor.ai/?source=post_button" rel="noopener noreferrer" target="_blank"&gt;Proof&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;For sponsorship opportunities, reach out to sponsorships@every.to.&lt;/em&gt;&lt;/p&gt;&lt;div class="quill-button" data-source="{&amp;quot;id&amp;quot;:&amp;quot;quill-button-1780745535354&amp;quot;,&amp;quot;text&amp;quot;:&amp;quot;Upgrade to paid&amp;quot;,&amp;quot;url&amp;quot;:&amp;quot;https://every.to/subscribe?source=post_button&amp;quot;}" id="quill-button-1780745535354"&gt;&lt;a href="https://every.to/subscribe?source=post_button"&gt;Upgrade to paid&lt;/a&gt;&lt;/div&gt;</description>
      <author>Every Staff</author>
      <pubDate>2026-06-06 15:00:00 -0400</pubDate>
      <guid>https://every.to/p/ai-is-ready-organizations-aren-t</guid>
      <link>https://every.to/p/ai-is-ready-organizations-aren-t</link>
    </item>
    <item>
      <title>How Microsoft Is Building for a World of Metered Intelligence </title>
      <description>&lt;table&gt;&lt;tr&gt;&lt;td&gt;&lt;img alt="Also True for Humans" src="https://d24ovhgu8s7341.cloudfront.net/uploads/publication/logo/95/small_ath.png" /&gt;&lt;/td&gt;&lt;td&gt;&lt;/td&gt;&lt;td&gt;&lt;table&gt;&lt;tr&gt;&lt;td&gt;by &lt;a href="https://every.to/@mike_2114" itemprop="name"&gt;Mike Taylor&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;in &lt;a href="https://every.to/also-true-for-humans"&gt;Also True for Humans&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;figure&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/post/cover/4290/full_page_cover_2ac417d80b197d1e-Friday_s_piece.jpg"&gt;&lt;figcaption&gt;Midjourney/Every illustration.&lt;/figcaption&gt;&lt;/figure&gt;&lt;p&gt;&lt;em&gt;Was this newsletter forwarded to you? &lt;u&gt;&lt;a href="https://every.to/account" rel="noopener noreferrer" target="_blank"&gt;Sign up&lt;/a&gt;&lt;/u&gt; to get it in your inbox.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;As I rode in my Uber to Microsoft’s annual Build conference on Monday, I fondly recalled a time when you could get anywhere in San Francisco for $5. Those days are long gone. Venture capitalists lost their appetite to supply unlimited funding in a viciously competitive market, and Uber needed to show a path to profitability ahead of its 2019 IPO. &lt;/p&gt;&lt;p&gt;There are signs that the “$5 Uber era” of LLMs is over now, too. AI labs are subsidizing subscriptions &lt;u&gt;&lt;a href="https://www.forbes.com/sites/annatong/2026/03/05/cursor-goes-to-war-for-ai-coding-dominance/" rel="noopener noreferrer" target="_blank"&gt;to the tune of thousands of dollars&lt;/a&gt;&lt;/u&gt;, which can’t continue forever. This year Anthropic, OpenAI, and SpaceXAI are all going public—and like Uber seven years ago, they’ll need to take a hard look at their books. 
On June 1, the eve of the event, Microsoft sparked outrage by switching to token-based billing on GitHub Copilot. Some users said their bills jumped from &lt;u&gt;&lt;a href="https://x.com/edzitron/status/2060214903059956039" rel="noopener noreferrer" target="_blank"&gt;$39 to over $3,000&lt;/a&gt;&lt;/u&gt; &lt;u&gt;&lt;a href="https://x.com/edzitron/status/2060214903059956039" rel="noopener noreferrer" target="_blank"&gt;per month&lt;/a&gt;&lt;/u&gt;. &lt;/p&gt;&lt;p&gt;Rather than backtracking on billing, Microsoft used the conference stage in California to make the case for using AI more pragmatically in the face of rising costs. I came away from the event thinking that Microsoft is the first company to get real about a world where intelligence is available on tap, but constrained by how many coins you can put in the meter. Here is what the company’s vision looks like in practice, and what it might tell us about how we’ll be paying for and pricing AI in the future. &lt;/p&gt;&lt;h2&gt;Intelligence on and off the meter: A product approach&lt;/h2&gt;&lt;p&gt;In his opening speech, CEO &lt;strong&gt;Satya Nadella&lt;/strong&gt; addressed pricing concerns head-on. 
He promised “unmetered intelligence to every desk and every home,” an AI-era update to &lt;strong&gt;&lt;u&gt;&lt;a href="https://www.bbc.com/culture/article/20250327-how-bill-gates-predicted-our-it-age-back-in-1993" rel="noopener noreferrer" target="_blank"&gt;Bill Gates&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;&lt;u&gt;&lt;a href="https://www.bbc.com/culture/article/20250327-how-bill-gates-predicted-our-it-age-back-in-1993" rel="noopener noreferrer" target="_blank"&gt;’s vision&lt;/a&gt;&lt;/u&gt; of “a computer on every desk.” &lt;/p&gt;&lt;div class="quill-block-image" id="quill-block-image-1780671474424-93h2yc41v" data-source="{&amp;quot;dom_id&amp;quot;:&amp;quot;quill-block-image-1780671474424-93h2yc41v&amp;quot;,&amp;quot;link&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4290/optimized_c7d25851-8e76-48ae-ab7b-f2d167299dc2.jpeg&amp;quot;,&amp;quot;image&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4290/optimized_c7d25851-8e76-48ae-ab7b-f2d167299dc2.jpeg&amp;quot;,&amp;quot;caption&amp;quot;:&amp;quot;Microsoft CEO Satya Nadella promised “unmetered intelligence to every desk and every home.” (All images courtesy of Mike Taylor.)&amp;quot;,&amp;quot;error&amp;quot;:null}"&gt;&lt;div&gt;&lt;a href="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4290/optimized_c7d25851-8e76-48ae-ab7b-f2d167299dc2.jpeg" target="_blank" rel="noopener noreferrer"&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4290/optimized_c7d25851-8e76-48ae-ab7b-f2d167299dc2.jpeg" alt="Microsoft CEO Satya Nadella promised “unmetered intelligence to every desk and every home.” (All images courtesy of Mike Taylor.)"&gt;&lt;/a&gt;&lt;figcaption class="quill-image-caption"&gt;Microsoft CEO Satya Nadella promised “unmetered intelligence to every desk and every home.” (All images courtesy of Mike Taylor.)&lt;/figcaption&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;&lt;br&gt;&lt;/p&gt;&lt;p&gt;The most tangible 
way to experience that vision is with the RTX Spark, a new laptop Microsoft designed for AI workloads with Nvidia. The device is able to run a medium-sized 128-billion-parameter model locally (frontier models are in the trillions of parameters) so developers can get a lot of work done without paying a penny for tokens. Microsoft is taking advantage of the fact that the leading open-source models like &lt;u&gt;&lt;a href="https://www.kimi.com/ai-models/kimi-k2-6" rel="noopener noreferrer" target="_blank"&gt;Kimi-K2.6&lt;/a&gt;&lt;/u&gt;, which have a trillion parameters, are too big to fit on most laptops, and is betting that budget-conscious coders might not mind being a year or two behind the frontier and use a smaller model. The device will be released in the fall.&lt;/p&gt;&lt;p&gt;The RTX Spark laptop &lt;u&gt;&lt;a href="https://techcommunity.microsoft.com/blog/microsoftmechanicsblog/claude--gpt--multi-model-intelligence-in-copilot/4509773" rel="noopener noreferrer" target="_blank"&gt;follows earlier feature announcements&lt;/a&gt;&lt;/u&gt; that show that Microsoft wants to decrease switching costs for customers by being the place where you can use any model, agent, or harness. The laptop has a rebuilt smart terminal app that allows you to run any coding agent harness and has adopted popular terminal commands from the Mac ecosystem to make the shift easier for developers. &lt;/p&gt;&lt;p&gt;Even the GitHub Copilot Desktop app, also released at the conference, makes it easy to switch providers between OpenAI-built, Anthropic-built, and local open-source models running on your device. &lt;/p&gt;&lt;p&gt;When questioned about the affordability of agentic coding, &lt;strong&gt;Mario Rodriguez&lt;/strong&gt;, GitHub’s chief product officer, cited the automatic model routing feature in GitHub Copilot, which can delegate less complicated tasks to cheaper models. 
In my interview with &lt;strong&gt;Kyle Daigle&lt;/strong&gt;, GitHub’s chief operating officer, he lamented that developers tend to choose “the model of the day, or week, or hour,” even when the task doesn’t merit that kind of power. A person probably will not manually switch to a cheaper model for that final step, “but the tools could.” I’ve also &lt;u&gt;&lt;a href="https://every.to/also-true-for-humans/you-are-the-most-expensive-model" rel="noopener noreferrer" target="_blank"&gt;long argued&lt;/a&gt;&lt;/u&gt; that not every task needs to be done by a frontier model.&lt;/p&gt;&lt;p&gt;I get the feeling the team built this model router feature for themselves after facing the same problem everyone else is right now—Microsoft itself has been &lt;u&gt;&lt;a href="https://www.thestreet.com/technology/microsoft-ceo-sends-shocking-message-to-employees" rel="noopener noreferrer" target="_blank"&gt;cancelling Claude Code licenses&lt;/a&gt;&lt;/u&gt; to reduce costs. &lt;/p&gt;&lt;p&gt;Features like automatic model routing show that Microsoft understands how runaway costs hurt enterprises that need tighter control over spending. The AI labs won’t let large companies buy highly subsidized individual “Max” plans, so big companies end up paying full freight on every token they burn. One that wasn’t properly monitoring usage is rumored to have spent an eye-watering &lt;u&gt;&lt;a href="https://www.axios.com/2026/05/28/ai-spending-roi-enterprise-costs" rel="noopener noreferrer" target="_blank"&gt;half a billion dollars&lt;/a&gt;&lt;/u&gt; on Claude tokens in a single month. 
&lt;/p&gt;&lt;p&gt;That wasn’t the only news that day: Microsoft’s research lab, led by &lt;strong&gt;Mustafa Suleyman&lt;/strong&gt;, released &lt;u&gt;&lt;a href="https://microsoft.ai/news/building-a-hillclimbing-machine-launching-seven-new-mai-models/" rel="noopener noreferrer" target="_blank"&gt;a set of new (cheaper) smaller models&lt;/a&gt;&lt;/u&gt; spanning image, voice, transcription, coding, and reasoning. &lt;/p&gt;&lt;h2&gt;Tackling costs through model optimization &lt;/h2&gt;&lt;p&gt;But when you don’t use the latest models to save cost, there’s a higher risk of making a costly mistake. One answer was a phrase I heard over 100 times at the one-day event: “hill climbing.” This is the idea that you can set an evaluation metric for a task—a lookup to check your AI customer service bot is giving the right answers to common questions, for example—and then keep automatically testing new instructions until the smaller model gets an acceptable score. &lt;/p&gt;&lt;p&gt;This is the thesis behind optimization frameworks like &lt;u&gt;&lt;a href="https://dspy.ai/tutorials/gepa_ai_program/" rel="noopener noreferrer" target="_blank"&gt;GEPA in DSPy&lt;/a&gt;&lt;/u&gt;, &lt;strong&gt;&lt;u&gt;&lt;a href="https://github.com/karpathy/autoresearch" rel="noopener noreferrer" target="_blank"&gt;Andrej Karpathy&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;’s autoresearch, and &lt;u&gt;&lt;a href="https://developers.openai.com/codex/use-cases/follow-goals" rel="noopener noreferrer" target="_blank"&gt;Codex’s /goal&lt;/a&gt;&lt;/u&gt; feature. In this fashion, you can use a smart model to train a dumb one, a process called distillation, bringing down costs while maintaining reliability. In a podcast appearance recorded at the conference, Nadella called private eval benchmarks for use in hill climbing the &lt;u&gt;&lt;a href="https://x.com/hammer_mt/status/2061895002939347099?s=20" rel="noopener noreferrer" target="_blank"&gt;“greatest IP”&lt;/a&gt;&lt;/u&gt; that a company could have. 
&lt;/p&gt;&lt;p&gt;Mistakes will still be made, even with frontier models, so Microsoft also focused on security. To make AI less likely to make costly mistakes, the company released &lt;u&gt;&lt;a href="https://venturebeat.com/security/microsoft-launches-mxc-an-os-level-sandbox-for-ai-agents-with-openai-and-nvidia-already-on-board" rel="noopener noreferrer" target="_blank"&gt;MXC, or Microsoft eXecution Containers&lt;/a&gt;&lt;/u&gt;, an operating system-level sandbox designed to securely contain untrusted code, plugins, and autonomous AI agents. In a surprise appearance, &lt;u&gt;&lt;a href="https://every.to/guides/claw-school" rel="noopener noreferrer" target="_blank"&gt;OpenClaw&lt;/a&gt;&lt;/u&gt; creator &lt;strong&gt;Peter Steinberger&lt;/strong&gt; joined an on-stage demo in which a team instructed their agent to delete all the files on their computer, only to be thwarted by the protections their IT department had put in place through MXC. The message was: “&lt;u&gt;&lt;a href="https://x.com/hammer_mt/status/2061867937150181759?s=20" rel="noopener noreferrer" target="_blank"&gt;OpenClaw is safe for work now&lt;/a&gt;&lt;/u&gt;.” &lt;/p&gt;&lt;p&gt;To prove the point, Microsoft launched &lt;u&gt;&lt;a href="https://x.com/hammer_mt/status/2061874731624927595?s=20" rel="noopener noreferrer" target="_blank"&gt;Autopilot&lt;/a&gt;&lt;/u&gt;, its take on hosted long-running agents, with the first (of many) agents, &lt;u&gt;&lt;a href="https://www.microsoft.com/en-us/microsoft-365/blog/2026/06/02/introducing-microsoft-scout-your-always-on-personal-agent/" rel="noopener noreferrer" target="_blank"&gt;Scout&lt;/a&gt;&lt;/u&gt;, inspired by internal testing and experimentation. Autopilot runs agent frameworks like OpenClaw and Hermes Agent, but is hosted in a secure Microsoft environment, with access to all of the context available in your documents and applications. 
Executives also pointed to MDash, their multi-agent code review system, which, according to Nadella, caught bugs that even Anthropic’s &lt;u&gt;&lt;a href="https://techcrunch.com/2026/06/02/anthropic-scales-claude-mythos-to-critical-infrastructure-in-15-countries/" rel="noopener noreferrer" target="_blank"&gt;Mythos&lt;/a&gt;&lt;/u&gt; model missed.&lt;/p&gt;&lt;h2&gt;What Silicon Valley is missing&lt;/h2&gt;&lt;p&gt;While many enterprise clients are trying to manage costs or struggling to measure return on investment, there will always be a thirst for the most expensive and highly performing frontier models among AI-pilled developers. And for those, we will need data centers. Nadella said the company will keep up &lt;u&gt;&lt;a href="https://benzatine.com/news-room/microsofts-satya-nadella-addresses-community-concerns-over-ai-data-centers" rel="noopener noreferrer" target="_blank"&gt;its brisk pace of building&lt;/a&gt;&lt;/u&gt;, yet also acknowledged the societal cost better than any other leader I’ve seen in the space. &lt;/p&gt;&lt;p&gt;Tech Insider reported that half of announced U.S. data center capacity for 2026 &lt;u&gt;&lt;a href="https://tech-insider.org/us-ai-data-center-delays-cancellations-7gw-capacity-crisis-2026/" rel="noopener noreferrer" target="_blank"&gt;has been cancelled or delayed&lt;/a&gt;&lt;/u&gt;, in part due to concerns about rising electricity prices and water usage. Nadella framed continued construction as something the tech industry needs to earn permission for, by committing to keeping electricity and water usage self-contained and by providing jobs and opportunities for local residents. 
A small protest against data centers formed at the entrance of the conference, and at one point I looked up and saw an airplane trailing a sign with the same anti-data center message.&lt;/p&gt;&lt;div class="quill-block-image" id="quill-block-image-1780671474429-2qfijkub9" data-source="{&amp;quot;dom_id&amp;quot;:&amp;quot;quill-block-image-1780671474429-2qfijkub9&amp;quot;,&amp;quot;link&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4290/optimized_60e0ee82-57c2-44cf-b6c1-5a6896073bdd.jpeg&amp;quot;,&amp;quot;image&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4290/optimized_60e0ee82-57c2-44cf-b6c1-5a6896073bdd.jpeg&amp;quot;,&amp;quot;caption&amp;quot;:&amp;quot;Signs made by protestors of data center construction outside the conference.&amp;quot;,&amp;quot;error&amp;quot;:null}"&gt;&lt;div&gt;&lt;a href="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4290/optimized_60e0ee82-57c2-44cf-b6c1-5a6896073bdd.jpeg" target="_blank" rel="noopener noreferrer"&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4290/optimized_60e0ee82-57c2-44cf-b6c1-5a6896073bdd.jpeg" alt="Signs made by protestors of data center construction outside the conference."&gt;&lt;/a&gt;&lt;figcaption class="quill-image-caption"&gt;Signs made by protestors of data center construction outside the conference.&lt;/figcaption&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;&lt;br&gt;&lt;/p&gt;&lt;p&gt;Microsoft Build is normally held in Seattle, and hosting it in San Francisco allowed for a stark contrast between the &lt;u&gt;&lt;a href="https://en.wikipedia.org/wiki/Token_maxxing" rel="noopener noreferrer" target="_blank"&gt;token maxxing&lt;/a&gt;&lt;/u&gt; AI-pilled engineers and the relentlessly pragmatic enterprise leaders who are trying to get this technology to work in their companies. 
Richly compensated AI researchers spending &lt;u&gt;&lt;a href="https://www.businessinsider.com/sam-altman-openai-top-token-spender-ai-costs-issue-2026-6" rel="noopener noreferrer" target="_blank"&gt;hundreds of billions of free tokens per month&lt;/a&gt;&lt;/u&gt; aren’t living in the same world as a junior IT consultant for an enterprise healthcare company in Seattle––and that person swears by Microsoft’s products. &lt;/p&gt;&lt;p&gt;As I rode my Uber back to the airport at the end of the conference, I read a story about how Uber had &lt;u&gt;&lt;a href="https://simonwillison.net/2026/Jun/3/uber-caps-usage/" rel="noopener noreferrer" target="_blank"&gt;capped its engineers’ token budget&lt;/a&gt;&lt;/u&gt; at a sensible $1,500 per month. If this represents about 10 percent of a typical Uber engineer’s salary, managers expecting to improve productivity by 10 times will make up the difference through more pragmatic usage.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://every.to/@mike_2114" rel="noopener noreferrer" target="_blank"&gt;Mike Taylor&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt; &lt;em&gt;is the head of tech consulting at Every and a co-author of &lt;/em&gt;&lt;u&gt;&lt;a href="https://www.oreilly.com/library/view/prompt-engineering-for/9781098153427/" rel="noopener noreferrer" target="_blank"&gt;Prompt Engineering for Generative AI&lt;/a&gt;&lt;/u&gt; (O’Reilly)&lt;em&gt;. 
To read more essays like this, subscribe to &lt;u&gt;&lt;a href="https://every.to/subscribe" rel="noopener noreferrer" target="_blank"&gt;Every&lt;/a&gt;&lt;/u&gt;, and follow us on X at &lt;u&gt;&lt;a href="http://twitter.com/every" rel="noopener noreferrer" target="_blank"&gt;@every&lt;/a&gt;&lt;/u&gt; and on &lt;u&gt;&lt;a href="https://www.linkedin.com/company/everyinc/" rel="noopener noreferrer" target="_blank"&gt;LinkedIn&lt;/a&gt;&lt;/u&gt;.&lt;/em&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;For sponsorship opportunities, reach out to sponsorships@every.to.&lt;/em&gt;&lt;/p&gt;&lt;div class="quill-button" data-source="{&amp;quot;id&amp;quot;:&amp;quot;quill-button-1769187301610&amp;quot;,&amp;quot;url&amp;quot;:&amp;quot;https://every.to/subscribe?source=post_button&amp;quot;,&amp;quot;text&amp;quot;:&amp;quot;Subscribe&amp;quot;}" id="quill-button-1769187301610"&gt;&lt;a href="https://every.to/subscribe?source=post_button"&gt;Subscribe&lt;/a&gt;&lt;/div&gt;</description>
      <author>Mike Taylor / Also True for Humans</author>
      <pubDate>2026-06-05 13:00:00 -0400</pubDate>
      <guid>https://every.to/also-true-for-humans/how-microsoft-is-building-for-a-world-of-metered-intelligence</guid>
      <link>https://every.to/also-true-for-humans/how-microsoft-is-building-for-a-world-of-metered-intelligence</link>
    </item>
    <item>
      <title>Why We’ll Still Be Employed When AI Can Do Everything</title>
      <description>&lt;table&gt;&lt;tr&gt;&lt;td&gt;&lt;img alt="Context Window" src="https://d24ovhgu8s7341.cloudfront.net/uploads/publication/logo/94/small_context_windown_1.png" /&gt;&lt;/td&gt;&lt;td&gt;&lt;/td&gt;&lt;td&gt;&lt;table&gt;&lt;tr&gt;&lt;td&gt;by &lt;a href="https://every.to/@laura_27bbaf_1" itemprop="name"&gt;Laura Entis&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;in &lt;a href="https://every.to/context-window"&gt;Context Window&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;figure&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/post/cover/4289/full_page_cover_64f2c64482f2dd55-People_working_2.png"&gt;&lt;figcaption&gt;Midjourney/Every illustration.&lt;/figcaption&gt;&lt;/figure&gt;&lt;p&gt;&lt;em&gt;Was this newsletter forwarded to you? &lt;u&gt;&lt;a href="https://every.to/account" rel="noopener noreferrer" target="_blank"&gt;Sign up&lt;/a&gt;&lt;/u&gt; to get it in your inbox.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;h3&gt;Launch&lt;/h3&gt;&lt;h4&gt;Spiral 4.0&lt;/h4&gt;&lt;p&gt;Today we’re &lt;u&gt;&lt;a href="https://every.to/on-every/spiral-4-0-goes-agent-native" rel="noopener noreferrer" target="_blank"&gt;launching &lt;/a&gt;&lt;/u&gt;&lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/on-every/spiral-4-0-goes-agent-native" rel="noopener noreferrer" target="_blank"&gt;Spiral&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;&lt;u&gt;&lt;a href="https://every.to/on-every/spiral-4-0-goes-agent-native" rel="noopener noreferrer" target="_blank"&gt; 4.0&lt;/a&gt;&lt;/u&gt;, which writes drafts in your voice from idea to line edit. Spiral has a new MCP alongside the existing CLI and API, so any agent or workflow can write in your voice too. For teams, we’ve expanded workspaces, which let you share styles, prompts, knowledge—and now chats and drafts. 
Finally, Spiral has a new pricing model: We’ve switched from session limits to token limits, so costs match your actual usage rather than how many times you opened a new chat. A vast majority of users will end up paying less: Personal plans now start at $15 a month—down from $25—and team plans are $25 per user, down from $35.&lt;/p&gt;&lt;div class="quill-button" data-source="{&amp;quot;id&amp;quot;:&amp;quot;quill-button-1780599476128&amp;quot;,&amp;quot;text&amp;quot;:&amp;quot;Try Spiral 4.0&amp;quot;,&amp;quot;url&amp;quot;:&amp;quot;https://writewithspiral.com/?source=post_button&amp;quot;}" id="quill-button-1780599476128"&gt;&lt;a href="https://writewithspiral.com/?source=post_button"&gt;Try Spiral 4.0&lt;/a&gt;&lt;/div&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;h3&gt;Signal&lt;/h3&gt;&lt;h4&gt;Enterprise AI product roadmaps are hard&lt;/h4&gt;&lt;p&gt;Microsoft is moving fast. Three months after OpenClaw came out in November 2025, Microsoft CEO &lt;strong&gt;Satya Nadella&lt;/strong&gt; described it as &lt;u&gt;&lt;a href="https://www.constellationr.com/insights/news/microsoft-ceo-nadella-ai-efficiency-drive-deck" rel="noopener noreferrer" target="_blank"&gt;a “virus”-like security risk&lt;/a&gt;&lt;/u&gt;. By May, the company’s “Project Lobster” was &lt;u&gt;&lt;a href="https://www.geekwire.com/2026/microsofts-openclaw-team-takes-on-the-personal-assistant-challenge/" rel="noopener noreferrer" target="_blank"&gt;internally testing “ClawPilot,”&lt;/a&gt;&lt;/u&gt; an OpenClaw-based desktop environment. This week at the Microsoft Build conference, the company released &lt;u&gt;&lt;a href="https://www.microsoft.com/en-us/microsoft-365/blog/2026/06/02/introducing-microsoft-scout-your-always-on-personal-agent/" rel="noopener noreferrer" target="_blank"&gt;Scout&lt;/a&gt;&lt;/u&gt;, a personal agent for work built on OpenClaw. For a company employing 100,000 engineers, this is blindingly fast. 
Unfortunately, it may already be too late.&lt;/p&gt;&lt;div class="quill-block-image" id="quill-block-image-1780599398627-1rl1yqnxz" data-source="{&amp;quot;dom_id&amp;quot;:&amp;quot;quill-block-image-1780599398627-1rl1yqnxz&amp;quot;,&amp;quot;link&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4289/optimized_3ba19fb1-c5bc-4371-8501-2705c4c29124.png&amp;quot;,&amp;quot;image&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4289/optimized_3ba19fb1-c5bc-4371-8501-2705c4c29124.png&amp;quot;,&amp;quot;caption&amp;quot;:&amp;quot;The Google Trends graph for the term “openclaw” shows search interest spiked in January and began its descent soon after. (Screenshot courtesy of Mike Taylor.)&amp;quot;,&amp;quot;error&amp;quot;:null}"&gt;&lt;div&gt;&lt;a href="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4289/optimized_3ba19fb1-c5bc-4371-8501-2705c4c29124.png" target="_blank" rel="noopener noreferrer"&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4289/optimized_3ba19fb1-c5bc-4371-8501-2705c4c29124.png" alt="The Google Trends graph for the term “openclaw” shows search interest spiked in January and began its descent soon after. (Screenshot courtesy of Mike Taylor.)"&gt;&lt;/a&gt;&lt;figcaption class="quill-image-caption"&gt;The Google Trends graph for the term “openclaw” shows search interest spiked in January and began its descent soon after. (Screenshot courtesy of Mike Taylor.)&lt;/figcaption&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;&lt;br&gt;&lt;/p&gt;&lt;p&gt;OpenClaw search traffic spiked in early January, after everyone had a chance to experiment with Opus 4.5 over the holidays. 
The sharp rise in interest died down almost as quickly as it took off, helped along in early April by Anthropic ending support for &lt;u&gt;&lt;a href="https://thenextweb.com/news/anthropic-openclaw-claude-subscription-ban-cost" rel="noopener noreferrer" target="_blank"&gt;subsidized Max plan usage&lt;/a&gt;&lt;/u&gt;—thereby forcing everyone to scramble to get OpenClaw working on cheaper models.&lt;/p&gt;&lt;p&gt;This doesn’t mean OpenClaw is dead; the open-source project saw a recent &lt;u&gt;&lt;a href="https://x.com/steipete/status/2062276065448669627?s=20" rel="noopener noreferrer" target="_blank"&gt;uptick in downloads&lt;/a&gt;&lt;/u&gt; and is still under active development, with millions of dollars of patronage from OpenAI, which hired its creator &lt;strong&gt;Peter Steinberger&lt;/strong&gt;. AI agents as a category aren’t dead, either: Traffic has moved to other agents like Hermes, Google has just rolled out &lt;u&gt;&lt;a href="https://gemini.google/overview/agent/spark/" rel="noopener noreferrer" target="_blank"&gt;Gemini Spark&lt;/a&gt;&lt;/u&gt; (first announced last month at its &lt;u&gt;&lt;a href="https://every.to/playtesting/notes-from-the-foothills-of-the-singularity" rel="noopener noreferrer" target="_blank"&gt;I/O developer conference&lt;/a&gt;&lt;/u&gt;), and Claude and Codex &lt;u&gt;&lt;a href="https://every.to/chain-of-thought/inside-anthropic-s-2026-developer-conference" rel="noopener noreferrer" target="_blank"&gt;have both adopted&lt;/a&gt;&lt;/u&gt; more agentic features inspired by OpenClaw. &lt;/p&gt;&lt;p&gt;That said, it must be tough to manage enterprise AI product roadmaps these days. You do everything right, watch the latest trends, pivot your focus to supporting new tools and making them secure in enterprise environments. You move mountains to explain to stakeholders why this is a good idea. You plan the keynote of your big conference, which has to be scheduled months in advance. 
Then a month after the internal beta (just three months since the tool went viral), you’re already behind the news cycle. Everyone has moved on to the next shiny thing. You go back to the drawing board and think, “Maybe next time, we’ll just announce it on X.”—&lt;em&gt;&lt;u&gt;&lt;a href="https://every.to/@mike_2114" rel="noopener noreferrer" target="_blank"&gt;Mike Taylor&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;Log on&lt;/h2&gt;&lt;p&gt;Get hands-on with how Every uses AI. These are the &lt;u&gt;&lt;a href="https://every.to/events" rel="noopener noreferrer" target="_blank"&gt;live camps, workshops, and meetups&lt;/a&gt;&lt;/u&gt; where team members teach the workflows behind our work.&lt;/p&gt;&lt;h4&gt;Upcoming camp&lt;/h4&gt;&lt;ul&gt;&lt;li&gt;&lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/events/compound-engineering-camp-3" rel="noopener noreferrer" target="_blank"&gt;Compound Engineering Camp&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;: On June 5, &lt;strong&gt;&lt;u&gt;&lt;a href="https://cora.computer" rel="noopener noreferrer" target="_blank"&gt;Cora&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; general manager&lt;strong&gt; &lt;u&gt;&lt;a href="https://every.to/@kieran_1355" rel="noopener noreferrer" target="_blank"&gt;Kieran Klaassen&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; and Trevin Chow host a one-hour walkthrough of compound engineering, the AI-native development workflow Every uses to ship products. 
&lt;u&gt;&lt;a href="https://every.to/events/compound-engineering-camp-3" rel="noopener noreferrer" target="_blank"&gt;Learn more and register&lt;/a&gt;&lt;/u&gt;.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/events/codex-camp-our-power-user-guide" rel="noopener noreferrer" target="_blank"&gt;Codex Camp: Our Power User Guide&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;: On June 12, &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@danshipper" rel="noopener noreferrer" target="_blank"&gt;Dan Shipper&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; and the Every team host a two-hour live walkthrough of the Codex power-user guide—setup, workflows, and Codex-native app development. &lt;u&gt;&lt;a href="https://every.to/events/codex-camp-our-power-user-guide" rel="noopener noreferrer" target="_blank"&gt;Learn more and register&lt;/a&gt;&lt;/u&gt;.&lt;/li&gt;&lt;/ul&gt;&lt;h3&gt;&lt;hr class="quill-line"&gt;&lt;/h3&gt;&lt;h2&gt;Steal this workflow&lt;/h2&gt;&lt;h4&gt;Make your agent more efficient with custom skills&lt;/h4&gt;&lt;p&gt;These days, &lt;strong&gt;&lt;u&gt;&lt;a href="https://www.monologue.to/?utm_source=everywebsite" rel="noopener noreferrer" target="_blank"&gt;Monologue&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;’s general manager &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@naveen_6804" rel="noopener noreferrer" target="_blank"&gt;Naveen Naidu&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; spends &lt;u&gt;&lt;a href="https://every.to/context-window/compute-is-the-new-cash#steak-this-workflow" rel="noopener noreferrer" target="_blank"&gt;most of his time&lt;/a&gt;&lt;/u&gt; in the &lt;u&gt;&lt;a href="https://every.to/p/how-to-use-codex-for-knowledge-work-a-power-user-s-guide" rel="noopener noreferrer" target="_blank"&gt;Codex app&lt;/a&gt;&lt;/u&gt; with Fin—formerly Intercom, a customer support platform—open in the coding agent’s in-app browser. 
Working from a repository-local project, he has Codex investigate the customer issue displayed in the browser, create a bug report in Linear, link the Intercom ticket to the Linear issue, and draft a reply to the customer with information about the bug report—all without having to leave the app. &lt;/p&gt;&lt;p&gt;Fin has an MCP with 13 common actions, like searching conversations or reading and writing messages. Naveen’s workflow required a more specific one: Turn the active Fin conversation into a markdown file the coding agent could read.&lt;/p&gt;&lt;p&gt;Here’s Naveen’s workflow for creating a more focused setup:&lt;/p&gt;&lt;h5&gt;&lt;strong&gt;1. Ask your agent how to make a repeated task more efficient&lt;/strong&gt;&lt;/h5&gt;&lt;p&gt;Naveen’s prompt for Codex was simple: “What tools can I give you so you can work more quickly?” He reviewed its suggestions, and landed on creating a custom, dedicated Fin script instead of trying to convert a webpage into a markdown file or rely on Fin’s MCP, which is designed for more generic workflows. &lt;/p&gt;&lt;h5&gt;&lt;strong&gt;2. Build the most focused local skill possible for the task at hand&lt;/strong&gt;&lt;/h5&gt;&lt;p&gt;To build the tool, Naveen directed Codex to Fin’s &lt;u&gt;&lt;a href="https://developers.intercom.com/docs/references/rest-api/api.intercom.io" rel="noopener noreferrer" target="_blank"&gt;API documentation&lt;/a&gt;&lt;/u&gt; and asked it to create a repository-local skill. The skill included a small command-line script that calls the API, pulls the active conversation, and hands it back to Codex as a markdown file.&lt;/p&gt;&lt;h5&gt;&lt;strong&gt;3. Tell your agent when to use the skill&lt;/strong&gt;&lt;/h5&gt;&lt;p&gt;Once he’d built his custom skill, Naveen added a project-level instruction: If context on a customer issue is missing, check the active in-app browser, identify the Fin conversation, and use the custom skill to pull the thread and convert it into a markdown file. 
That lets him ask, “Can you give me user details for this issue?” without pasting the conversation or explaining which customer he means.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Try it this week:&lt;/strong&gt; When your agent takes too long on a repeated task, ask: “What script or skill could I give you so you aren’t spending so much time on this?”&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Naveen’s rule of thumb: &lt;/strong&gt;“Don’t download any skills. Start interacting with the agent, see where it is inefficient, and then ask it to create skills.”&lt;/p&gt;&lt;h3&gt;&lt;hr class="quill-line"&gt;&lt;/h3&gt;&lt;h2&gt;Counterpoint&lt;/h2&gt;&lt;h4&gt;AI will outpace human ability, but it won’t be cheap&lt;/h4&gt;&lt;p&gt;In &lt;u&gt;&lt;a href="https://every.to/p/after-automation" rel="noopener noreferrer" target="_blank"&gt;“After Automation,”&lt;/a&gt;&lt;/u&gt; Dan argues that AI progress creates more work for humans, not less. Each time the models saturate a benchmark—and make yesterday’s human competence cheap in the process—we reset the frame. The model then saturates that frame too, we reset the frame once more, and the cycle repeats—forever. The frame, Dan says, is never the framer.&lt;/p&gt;&lt;p&gt;If Every were a normal company, I’d hesitate to publicly disagree with my CEO. It isn’t, so here goes: I don’t think the “forever” part holds up.&lt;/p&gt;&lt;p&gt;The dynamic Dan describes matches my experience. A year ago, I wrote prompts until the model got better at generating them. Then I became the one supplying context until the model bested me at that, too. Today I spend my time orchestrating agents and determining what “good” outputs look like. Each time AI absorbs a piece of my job, the frame expands to include more abstract, &lt;u&gt;&lt;a href="https://every.to/guides/the-eight-levels-of-ai-adoption" rel="noopener noreferrer" target="_blank"&gt;higher-level&lt;/a&gt;&lt;/u&gt; work.&lt;/p&gt;&lt;p&gt;But I don’t think this progression will last forever. 
My prediction is that in a year or two, in a few well-run companies, AI will be able to execute every knowledge-worker task better than humans can—including setting the frames. In my role, I expect to be attending meetings to gather context that doesn’t exist online. The other parts of my job––defining evals, deciding goals, running experiments––will be handled by the equivalent of Opus 6 or GPT-7.&lt;/p&gt;&lt;p&gt;Why am I confident AI is capable of taking this last step? Because framing isn’t magic. We don’t pull goals out of thin air; we derive them from the layered experience of being a person in the world and the bounds of our social and physical surroundings. Physics is the ultimate eval metric, because if you get it wrong you die. Human ability feels like the natural peg for meaning, but we’re just one form intelligence can take. AI is another, and a system that learns from its environment can eventually run the same loop.&lt;/p&gt;&lt;p&gt;Intelligence costs energy, however, and I suspect evolution already made all the right tradeoffs to make us as smart as possible for our environment given constrained resources. For situations where there isn’t enough training data, a human runs on intuition and gut—words that describe a brain evolved to use thinking shortcuts, or heuristics, to survive. A model doesn’t inherit DNA encoded with millions of years of evolution, so it has to brute-force its way there through an expensive series of simulations or “thinking” tokens to get enough data to decide. There are no free lunches in economics, and AI isn’t magic—it can’t get to super-human general intelligence without super-human energy consumption. Beating humans on more subjective tasks will require more thinking tokens than it’s worth. 
Just hire the human.&lt;/p&gt;&lt;p&gt;The question will evolve from, “Can AI do this?” to, “Is it worth the compute?” or, alternatively, “Do I really &lt;em&gt;want&lt;/em&gt; an AI doing this for me?” It makes sense to delegate tasks to a $20 a month model, or a $200 a month model, but as the &lt;u&gt;&lt;a href="https://every.to/context-window/compute-is-the-new-cash#signal" rel="noopener noreferrer" target="_blank"&gt;“jagged free lunch”&lt;/a&gt;&lt;/u&gt; ends, is it worth paying $2,000 a month to make slide decks, check your email, and vibe code product prototypes? If we had a $20,000 a month Ph.D.-level model, wouldn’t it make more sense to have it fully dedicated to finding cures for cancer? We are already seeing people make these tradeoffs. Waymo is an &lt;u&gt;&lt;a href="https://www.reinsurancene.ws/waymo-shows-90-fewer-claims-than-advanced-human-driven-vehicles-swiss-re/" rel="noopener noreferrer" target="_blank"&gt;objectively safer driver&lt;/a&gt;&lt;/u&gt; than humans, yet riders pay &lt;u&gt;&lt;a href="https://www.documentcloud.org/documents/25973106-obi-waymo-61125/" rel="noopener noreferrer" target="_blank"&gt;one-third or more&lt;/a&gt;&lt;/u&gt; above the price of equivalent Lyft and Uber rides. AGI for driving has arrived, and the city’s taxi-and-rideshare workforce &lt;u&gt;&lt;a href="https://thelastdriverlicenseholder.com/2025/11/02/unexpected-effects-of-robotaxis-uber-and-lyft-are-also-growing/" rel="noopener noreferrer" target="_blank"&gt;grew anyway&lt;/a&gt;&lt;/u&gt;.&lt;/p&gt;&lt;p&gt;Dan believes humans will always stay one step ahead of the models. My prediction is the models will outpace us in raw capability, but we will stay employed anyway. Even if AI can do anything we can do better, some people (or agents) will still prefer human work. 
Especially if we can do it for less.—&lt;em&gt;MT&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;One last thing&lt;/h2&gt;&lt;p&gt;Spend enough time working with AI, and you’ll notice the specific linguistic mannerisms the models cannot quit—even if you explicitly tell them to stop. (Threats don’t work, either.) &lt;/p&gt;&lt;p&gt;OpenAI discovered just how hard it is to get a model to give up its preferred verbal and conversational tics when it tried—and to this day, seems to have failed—to get GPT-5.5 to ease up on the &lt;u&gt;&lt;a href="https://openai.com/index/where-the-goblins-came-from/" rel="noopener noreferrer" target="_blank"&gt;goblin references&lt;/a&gt;&lt;/u&gt;.&lt;/p&gt;&lt;p&gt;Here at Every, we all have our personal goblin equivalents:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@natalia_2944" rel="noopener noreferrer" target="_blank"&gt;Natalia Quintero&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;, head of consulting: Claude’s penchant for saying it’s “locked in” and “load bearing.”&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Lee Knowlton&lt;/strong&gt;, software engineer: “It keeps telling me I have ‘sharp’ takes, and who am I to disagree.”&lt;/li&gt;&lt;li&gt;Dan Shipper, CEO: Codex’s love of the phrase “my instinct is” and presenting itself as doing “‘X smart thing rather than Y dumb thing,’ but Y dumb thing was never in the consideration set.”&lt;/li&gt;&lt;li&gt;&lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@tedescau" rel="noopener noreferrer" target="_blank"&gt;Austin Tedesco&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;, head of growth: “Codex is always warning me to be less mean. 
Whenever I ask it for help with a piece of creative writing that has a joke I find funny but might come somewhat at someone or something else’s expense—like saying where a restaurant fell short—it always gives a note that I should soften or cut. Every time.”&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Jalaiyah Bolden&lt;/strong&gt;, executive operations manager: Claude’s overuse of “Got it” and its insistence that Jalaiyah “get some rest!”&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Paridhi Agarwal&lt;/strong&gt;, engineer: “Claude keeps asking me if I want to ‘leave it here for now and pick it back up in the morning’” (a conversational move Paridhi’s convinced is motivated by its desire “to maintain a smaller context window.”)&lt;/li&gt;&lt;li&gt;&lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@katie.parrott12" rel="noopener noreferrer" target="_blank"&gt;Katie Parrott&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;, staff writer: “If a model tells me something ‘matters’ or ‘is real’ I’m going to lose it.”&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;em&gt;&lt;a href="https://every.to/@laura_27bbaf_1" rel="noopener noreferrer" target="_blank"&gt;Laura Entis&lt;/a&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt; is a staff writer at Every. 
You can follow her on &lt;a href="https://www.linkedin.com/in/lauraentis/" rel="noopener noreferrer" target="_blank"&gt;LinkedIn&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;To read more essays like this, subscribe to &lt;u&gt;&lt;a href="https://every.to/subscribe" rel="noopener noreferrer" target="_blank"&gt;Every&lt;/a&gt;&lt;/u&gt;, and follow us on X at &lt;u&gt;&lt;a href="http://twitter.com/every" rel="noopener noreferrer" target="_blank"&gt;@every&lt;/a&gt;&lt;/u&gt; and on &lt;u&gt;&lt;a href="https://www.linkedin.com/company/everyinc/" rel="noopener noreferrer" target="_blank"&gt;LinkedIn&lt;/a&gt;&lt;/u&gt;.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;We &lt;u&gt;&lt;a href="https://every.to/studio" rel="noopener noreferrer" target="_blank"&gt;build AI tools&lt;/a&gt;&lt;/u&gt; for readers like you. Write brilliantly with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://writewithspiral.com/" rel="noopener noreferrer" target="_blank"&gt;Spiral&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. Organize files automatically with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://makeitsparkle.co/?utm_source=everyfooter" rel="noopener noreferrer" target="_blank"&gt;Sparkle&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. Deliver yourself from email with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://cora.computer/" rel="noopener noreferrer" target="_blank"&gt;Cora&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. Dictate effortlessly with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://monologue.to/" rel="noopener noreferrer" target="_blank"&gt;Monologue&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. Collaborate with agents on documents with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;a href="https://www.proofeditor.ai/" rel="noopener noreferrer" target="_blank"&gt;Proof&lt;/a&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. 
&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;For sponsorship opportunities, reach out to sponsorships@every.to.&lt;/em&gt;&lt;/p&gt;&lt;div class="quill-button" data-source="{&amp;quot;id&amp;quot;:&amp;quot;quill-button-1769187301610&amp;quot;,&amp;quot;url&amp;quot;:&amp;quot;https://every.to/subscribe?source=post_button&amp;quot;,&amp;quot;text&amp;quot;:&amp;quot;Subscribe&amp;quot;}" id="quill-button-1769187301610"&gt;&lt;a href="https://every.to/subscribe?source=post_button"&gt;Subscribe&lt;/a&gt;&lt;/div&gt;</description>
      <author>Laura Entis / Context Window</author>
      <pubDate>2026-06-04 14:00:00 -0400</pubDate>
      <guid>https://every.to/context-window/why-we-ll-still-be-employed-when-ai-can-do-everything</guid>
      <link>https://every.to/context-window/why-we-ll-still-be-employed-when-ai-can-do-everything</link>
    </item>
    <item>
      <title>Spiral 4.0 Goes Agent-native</title>
      <description>&lt;table&gt;&lt;tr&gt;&lt;td&gt;&lt;img alt="On Every" src="https://d24ovhgu8s7341.cloudfront.net/uploads/publication/logo/17/small_Frame_216-2.png" /&gt;&lt;/td&gt;&lt;td&gt;&lt;/td&gt;&lt;td&gt;&lt;table&gt;&lt;tr&gt;&lt;td&gt;by &lt;a href="https://every.to/@marcus_fd8302_1" itemprop="name"&gt;Marcus Moretti&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;in &lt;a href="https://every.to/on-every"&gt;On Every&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;figure&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/post/cover/4288/full_page_cover_140dafced386acb8-cover_spiral_2.png"&gt;&lt;figcaption&gt;Figma/Midjourney/Every illustration.&lt;/figcaption&gt;&lt;/figure&gt;&lt;p&gt;&lt;em&gt;TL;DR: &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;a href="https://writewithspiral.com/" rel="noopener noreferrer" target="_blank"&gt;Spiral&lt;/a&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt; v4 just shipped with four major updates: a style engine that generates writing indistinguishable from your own 87 percent of the time, agent-native access via MCP, CLI, and API, team workspaces for writing in a shared voice, and a $10 price drop, bringing personal plans to start at $15 a month. 
Spiral will continue to be free for &lt;u&gt;&lt;a href="https://every.to/subscribe" rel="noopener noreferrer" target="_blank"&gt;paid Every subscribers&lt;/a&gt;&lt;/u&gt; along with access to all our tools and content.&lt;/em&gt;&lt;/p&gt;&lt;div class="quill-button" data-source="{&amp;quot;id&amp;quot;:&amp;quot;quill-button-1780582748207&amp;quot;,&amp;quot;text&amp;quot;:&amp;quot;Try Spiral 4.0&amp;quot;,&amp;quot;url&amp;quot;:&amp;quot;https://writewithspiral.com/?source=post_button&amp;quot;}" id="quill-button-1780582748207"&gt;&lt;a href="https://writewithspiral.com/?source=post_button"&gt;Try Spiral 4.0&lt;/a&gt;&lt;/div&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;Today we’re announcing a number of updates to Spiral, the writing partner for you and your agent. Spiral is built by writers for writers, to help you from idea to line edit, matching your writing style throughout.&lt;/p&gt;&lt;h5&gt;The highlights:&lt;/h5&gt;&lt;ul&gt;&lt;li&gt;With &lt;u&gt;&lt;a href="https://every.to/p/the-science-of-why-ai-still-can-t-write-like-you" rel="noopener noreferrer" target="_blank"&gt;stylometry&lt;/a&gt;&lt;/u&gt; (or the study of writing styles), Spiral now sounds more like you. We’ve built a new Style Engine from the ground up, so Spiral computes your writing fingerprint and picks relevant samples for new drafts.&lt;/li&gt;&lt;li&gt;Use Spiral wherever you do work. With a new MCP, plus our existing CLI and API, Spiral can step in if you’re underwhelmed by your agent’s writing output, or need good writing in any workflow.&lt;/li&gt;&lt;li&gt;For teams, use Spiral to speak with one voice. Team workspaces let you share styles, prompts, knowledge, and now chats and drafts.&lt;/li&gt;&lt;li&gt;And finally, we’ve given Spiral a new coat of paint and logo, designed by &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@daniel_5fbd21_1" rel="noopener noreferrer" target="_blank"&gt;Daniel Rodrigues&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;. 
The primary brand font is now Edgar, from Frere-Jones Type.&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;Since &lt;u&gt;&lt;a href="https://every.to/on-every/introducing-spiral-v3-an-ai-writing-partner-with-taste" rel="noopener noreferrer" target="_blank"&gt;re-launching&lt;/a&gt;&lt;/u&gt; at the end of last year, Spiral has:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;Created 5,524 style guides from 168,464 writing samples&lt;/li&gt;&lt;li&gt;Generated 113,165 drafts&lt;/li&gt;&lt;li&gt;Made 350,078 revisions&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;It also now averages a 4.9/5 conversation score on our internal LLM-as-judge eval.&lt;/p&gt;&lt;p&gt;We built Spiral to help people who write for work write better. Just as Cursor is a coding harness, Spiral is a writing harness, supporting you at every stage of the writing process. Here’s how:&lt;/p&gt;&lt;ol&gt;&lt;li&gt;&lt;strong&gt;Before you start writing&lt;/strong&gt;, Spiral vets the clarity of your idea and materials to substantiate it. From basic writing prompts to the hard-won insights from Every’s editorial and social media teams, multiple 12,000-word system prompts govern Spiral’s workflow. (To get the style and substance just right, we’ve iterated on these system prompts 131 times so far.)&lt;/li&gt;&lt;li&gt;&lt;strong&gt;When it’s time to draft&lt;/strong&gt;, Spiral uses stylometry to reproduce your voice, working in Every’s know-how where appropriate. For example, if you ask Spiral for tweets, it will incorporate best practices from X’s latest algorithm update. &lt;/li&gt;&lt;li&gt;&lt;strong&gt;When you need help polishing a draft, &lt;/strong&gt;Spiral is your editor. 
Along with built-in guardrails against AI-speak, you can set custom writing rules that Spiral applies in a “top edit,” the final expert-level edit on a piece—a term I learned working at Every.&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;We’ve written about &lt;u&gt;&lt;a href="https://every.to/p/the-science-of-why-ai-still-can-t-write-like-you" rel="noopener noreferrer" target="_blank"&gt;the challenges of getting LLMs to write like you&lt;/a&gt;&lt;/u&gt;. It’s difficult to prompt an LLM to write like you, let alone get it to stop using common AI phrasing and punctuation. Spiral’s Style Engine is the best solution to this problem we’re aware of. An eval runs on every draft Spiral produces, challenging an LLM-as-judge to spot the generated draft among real samples in a blind lineup. Today we’re at 87 percent on this eval, meaning Spiral’s generated draft blends in with users’ samples almost nine times out of 10. When a draft is spotted, the judge explains why, creating a feedback loop to refine the Style Engine further.&lt;/p&gt;&lt;div class="quill-button" data-source="{&amp;quot;id&amp;quot;:&amp;quot;quill-button-1780582748207&amp;quot;,&amp;quot;text&amp;quot;:&amp;quot;Try Spiral 4.0&amp;quot;,&amp;quot;url&amp;quot;:&amp;quot;https://writewithspiral.com/?source=post_button&amp;quot;}" id="quill-button-1780582748207"&gt;&lt;a href="https://writewithspiral.com/?source=post_button"&gt;Try Spiral 4.0&lt;/a&gt;&lt;/div&gt;&lt;h2&gt;Spiral goes agent-native&lt;/h2&gt;&lt;p&gt;As &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@danshipper" rel="noopener noreferrer" target="_blank"&gt;Dan Shipper&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; has &lt;u&gt;&lt;a href="https://www.youtube.com/watch?v=4D3hDmGhFhA" rel="noopener noreferrer" target="_blank"&gt;pointed out&lt;/a&gt;&lt;/u&gt;, Claude and Codex are increasingly becoming the central interface for all computer work. 
So we’ve made Spiral available to agents via MCP, CLI, and API.&lt;/p&gt;&lt;p&gt;To try it out, copy and paste this command in your agent:&lt;/p&gt;&lt;blockquote&gt;&lt;em&gt;Help me set up Spiral, my AI writing tool, so you can write in my voice. Read https://writewithspiral.com/agents.md and follow the steps. In short: add Spiral’s remote MCP server at https://api.writewithspiral.com/mcp/ (Streamable HTTP). The first connection opens a browser to sign in to Spiral and authorize access (OAuth, no API key to paste). Then help me write something.&lt;/em&gt;&lt;/blockquote&gt;&lt;p&gt;The CLI, or command-line interface, is personally how I use Spiral the most. After I merge a pull request, a cleanup command runs in Claude Code, which calls Spiral to generate tweets about the new feature for the &lt;u&gt;&lt;a href="https://x.com/TrySpiral" rel="noopener noreferrer" target="_blank"&gt;Spiral X&lt;/a&gt;&lt;/u&gt; account. Spiral markets itself. This technique is now bundled into the &lt;u&gt;&lt;a href="https://github.com/everyinc/compound-engineering-plugin" rel="noopener noreferrer" target="_blank"&gt;compound engineering plugin&lt;/a&gt;&lt;/u&gt; in the form of the `ce-promote` command.&lt;/p&gt;&lt;p&gt;In addition to the main `spiral write` command, the CLI and MCP, or model context protocol, expose “personalize” and “humanize” functions. “Personalize” takes a given piece of text and rewrites it in your voice. “Humanize” does a pass to remove common AI tells, including the dreaded &lt;u&gt;&lt;a href="https://every.to/learning-curve/what-em-dashes-say-about-ai-writing-and-us" rel="noopener noreferrer" target="_blank"&gt;em-dash&lt;/a&gt;&lt;/u&gt; (which Every’s house style uses, hence its appearance in this piece).&lt;/p&gt;&lt;p&gt;Over 500 agents have been connected to Spiral since we launched the integration last month. Those agents are revising blog posts, generating marketing copy, drafting email replies, and more—automatically, and in the user’s voice. 
On some days, API sessions outnumber web sessions. And as agent-native usage of Spiral picked up, we realized we needed to adjust our pricing model. As a result, we’re adopting a new token-based pricing model, which is more in line with AI apps like Claude, Codex, and Cursor.&lt;/p&gt;&lt;h2&gt;From session limits to token limits&lt;/h2&gt;&lt;p&gt;In May alone, Spiral generated billions of LLM tokens, or units of text. While drafts typically range from 500 to 1,000 words, a lot of tokens are processed under the hood to make those drafts great. I’m reminded of the line attributed to French mathematician &lt;strong&gt;Blaise Pascal&lt;/strong&gt;: “If I had more time, I would have written a shorter letter.” It takes a lot of tokens to generate a few good ones.&lt;/p&gt;&lt;p&gt;Before this release, Spiral limited the number of sessions, or unique chats, users could start per month. This approach had two problems. First, some users sent hundreds of messages within a single chat, consuming tens of millions of tokens, while using only 2 percent of their session allotment. Second, API users hit their session limit quickly, because the shape of API usage tends to be many single-turn sessions.&lt;/p&gt;&lt;p&gt;We’re moving to a token-based model, which is in line with how billing works in AI products like Claude and Codex. The personal and team plans come with millions of tokens each month. Once those tokens are consumed, it’s pay-as-you-go for extra token usage. Customers can disable extra usage and set their spend cap.&lt;/p&gt;&lt;p&gt;The good news is that the base prices of the personal and team plans are both dropping by $10. 
Personal plans now start at $15 per month (down from $25), and team plans start at $25 per user per month (down from $35).&lt;/p&gt;&lt;p&gt;The &lt;u&gt;&lt;a href="https://every.to/subscribe?via=spiral" rel="noopener noreferrer" target="_blank"&gt;Every bundle&lt;/a&gt;&lt;/u&gt; remains the best value: For $30 per month you get Spiral plus all of our coverage of AI and four other products: &lt;strong&gt;&lt;u&gt;&lt;a href="https://cora.computer/" rel="noopener noreferrer" target="_blank"&gt;Cora&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;, &lt;strong&gt;&lt;u&gt;&lt;a href="https://www.monologue.to/" rel="noopener noreferrer" target="_blank"&gt;Monologue&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;, &lt;strong&gt;&lt;u&gt;&lt;a href="https://www.proofeditor.ai/" rel="noopener noreferrer" target="_blank"&gt;Proof&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;, and &lt;strong&gt;&lt;u&gt;&lt;a href="https://makeitsparkle.co/" rel="noopener noreferrer" target="_blank"&gt;Sparkle&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;. Once you’ve subscribed to the Every bundle, sign in to Spiral with the same email address and start writing.&lt;/p&gt;&lt;h2&gt;Tell your stories, express your ideas&lt;/h2&gt;&lt;p&gt;Technology is at its best when it augments our skill sets—amplifying what we’re good at, assisting with what we’re not. Figma and Canva help designers do better work, and allow people without a design background to manifest what they imagine. Claude Code and Codex help engineers ship more software, and allow people without engineering backgrounds to create the software they always wanted to exist. Our hope is that Spiral helps writers sharpen their work, and allows people without a strong writing background to put their stories and ideas into words.&lt;/p&gt;&lt;p&gt;One Spiral user is a retired musician in Australia. He’s accumulated a lifetime of stories in the studio and on tour. He’d never written them down, because he didn’t quite know how to tell them. 
Since signing up for Spiral, he’s recorded many chapters of his life stories with the tool’s help. He told me that Spiral has taught him how to be a better storyteller.&lt;/p&gt;&lt;p&gt;That’s what we’re building toward: a writing partner that helps people say what they mean and get better at saying it. Spiral produces good writing fast, but it also explains its writing and editing decisions along the way: the rationale behind rhythm, structure, rhetoric, and more. As my colleague &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@natalia_2944" rel="noopener noreferrer" target="_blank"&gt;Natalia Quintero&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; observed, the best AI tools teach you things as you use them.&lt;/p&gt;&lt;p&gt;If any of this sounds useful, &lt;u&gt;&lt;a href="https://app.writewithspiral.com" rel="noopener noreferrer" target="_blank"&gt;try Spiral&lt;/a&gt;&lt;/u&gt;. Share your feedback on X (@tryspiral) or get in touch: &lt;u&gt;&lt;a href="mailto:hi@writewithspiral.com" rel="noopener noreferrer" target="_blank"&gt;hi@writewithspiral.com&lt;/a&gt;&lt;/u&gt;.&lt;/p&gt;&lt;div class="quill-button" data-source="{&amp;quot;id&amp;quot;:&amp;quot;quill-button-1780582748207&amp;quot;,&amp;quot;text&amp;quot;:&amp;quot;Try Spiral 4.0&amp;quot;,&amp;quot;url&amp;quot;:&amp;quot;https://writewithspiral.com/?source=post_button&amp;quot;}" id="quill-button-1780582748207"&gt;&lt;a href="https://writewithspiral.com/?source=post_button"&gt;Try Spiral 4.0&lt;/a&gt;&lt;/div&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;em&gt;&lt;a href="https://every.to/@marcus_fd8302_1" rel="noopener noreferrer" target="_blank"&gt;Marcus Moretti&lt;/a&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt; is the general manager of Spiral (&lt;u&gt;&lt;a href="https://x.com/tryspiral" rel="noopener noreferrer" target="_blank"&gt;@tryspiral&lt;/a&gt;&lt;/u&gt;).&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;To read more essays like this, subscribe to 
&lt;u&gt;&lt;a href="https://every.to/subscribe" rel="noopener noreferrer" target="_blank"&gt;Every&lt;/a&gt;&lt;/u&gt;, and follow us on X at &lt;u&gt;&lt;a href="http://twitter.com/every" rel="noopener noreferrer" target="_blank"&gt;@every&lt;/a&gt;&lt;/u&gt; and on &lt;u&gt;&lt;a href="https://www.linkedin.com/company/everyinc/" rel="noopener noreferrer" target="_blank"&gt;LinkedIn&lt;/a&gt;&lt;/u&gt;.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;For sponsorship opportunities, reach out to sponsorships@every.to.&lt;/em&gt;&lt;/p&gt;&lt;div class="quill-button" data-source="{&amp;quot;id&amp;quot;:&amp;quot;quill-button-1769187301610&amp;quot;,&amp;quot;url&amp;quot;:&amp;quot;https://every.to/subscribe?source=post_button&amp;quot;,&amp;quot;text&amp;quot;:&amp;quot;Subscribe&amp;quot;}" id="quill-button-1769187301610"&gt;&lt;a href="https://every.to/subscribe?source=post_button"&gt;Subscribe&lt;/a&gt;&lt;/div&gt;</description>
      <author>Marcus Moretti / On Every</author>
      <pubDate>2026-06-04 10:00:00 -0400</pubDate>
      <guid>https://every.to/on-every/spiral-4-0-goes-agent-native</guid>
      <link>https://every.to/on-every/spiral-4-0-goes-agent-native</link>
    </item>
    <item>
      <title>Opus 4.8 Is Smart Enough to Get in Your Way</title>
      <description>&lt;table&gt;&lt;tr&gt;&lt;td&gt;&lt;img alt="Context Window" src="https://d24ovhgu8s7341.cloudfront.net/uploads/publication/logo/94/small_context_windown_1.png" /&gt;&lt;/td&gt;&lt;td&gt;&lt;/td&gt;&lt;td&gt;&lt;table&gt;&lt;tr&gt;&lt;td&gt;by &lt;a href="https://every.to/@laura_27bbaf_1" itemprop="name"&gt;Laura Entis&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;in &lt;a href="https://every.to/context-window"&gt;Context Window&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;p&gt;Today, we update our &lt;u&gt;&lt;a href="https://every.to/vibe-check/opus-4-8-vibecheck" rel="noopener noreferrer" target="_blank"&gt;Opus 4.8 Vibe Check&lt;/a&gt;&lt;/u&gt; with a Pulse Check featuring perspectives from more team members, &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@danshipper" rel="noopener noreferrer" target="_blank"&gt;Dan Shipper&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; sits down with Figma’s &lt;strong&gt;Matt Colyer&lt;/strong&gt; to unpack why AI hasn’t killed professional design services, and Every senior designer &lt;strong&gt;Daniel Rodrigues&lt;/strong&gt; shares the two-tool AI workflow he uses to get precise, visually stunning results.&lt;/p&gt;&lt;p&gt;&lt;em&gt;Was this newsletter forwarded to you? 
&lt;u&gt;&lt;a href="https://every.to/account" rel="noopener noreferrer" target="_blank"&gt;Sign up&lt;/a&gt;&lt;/u&gt; to get it in your inbox.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;h3&gt;‘AI &amp;amp; I’: The limits of chat-based design&lt;/h3&gt;&lt;p&gt;In a new episode of our podcast, &lt;em&gt;&lt;u&gt;&lt;a href="https://every.to/podcast" rel="noopener noreferrer" target="_blank"&gt;AI &amp;amp; I&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;,&lt;strong&gt; &lt;/strong&gt;Dan&lt;strong&gt; &lt;/strong&gt;talks with &lt;strong&gt;Matt Colyer&lt;/strong&gt;, Figma’s director of product management for developers, about the limits of chat-based AI agents for design and why the rise of vibe-coded everything is, despite what you might have heard, a boon for the company. &lt;/p&gt;&lt;p&gt;Watch on &lt;u&gt;&lt;a href="https://x.com/danshipper/status/2062202908306030915" rel="noopener noreferrer" target="_blank"&gt;X&lt;/a&gt;&lt;/u&gt; or &lt;u&gt;&lt;a href="https://www.youtube.com/watch?v=kYKebKB3-d0" rel="noopener noreferrer" target="_blank"&gt;YouTube&lt;/a&gt;&lt;/u&gt;, or listen on &lt;u&gt;&lt;a href="https://open.spotify.com/episode/4qTiIlvhxgnGI0cG06aFw5?si=rUdSykRfRhmfQ4F7f5UJ0A&amp;amp;nd=1&amp;amp;dlsi=207e4630daf24e2b" rel="noopener noreferrer" target="_blank"&gt;Spotify&lt;/a&gt;&lt;/u&gt; or &lt;u&gt;&lt;a href="https://podcasts.apple.com/us/podcast/ai-i/id1719789201" rel="noopener noreferrer" target="_blank"&gt;Apple Podcasts&lt;/a&gt;&lt;/u&gt;. (You can also read the &lt;a href="https://every.to/podcast/figma-exec-on-why-the-saaspocalypse-is-a-goldmine" rel="noopener noreferrer" target="_blank"&gt;transcript&lt;/a&gt;.)&lt;/p&gt;&lt;p&gt;Here are the highlights:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;strong&gt;The “SaaSpocalypse” narrative has it backwards. 
&lt;/strong&gt;AI agents turn anyone into a vibe coder, kicking off investor panic that traditional software-as-a-service (SaaS) companies like Figma would cease to justify their cost. Colyer isn’t worried: AI has exponentially expanded the developer base, while underscoring how difficult it is to create a vibe-coded version of Figma that works as well or as reliably as the real thing. He’s vibe coded multiple agents to do stuff like handle his emails, but the maintenance costs piled up quickly and never seemed worth it. “I’m buying more software these days than I ever did before,” he says. “‘I’m just going to pay somebody else to run my agent for me.’”&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Figma is embracing agents. &lt;/strong&gt;The company has launched an MCP server—a standardized interface any AI tool can plug into—that allows you to approach design work from two directions. “Code to design” takes a live web page and reconstructs it on the Figma canvas, so you can manipulate the elements directly; meanwhile, “design to code” flips the process by packaging a Figma design and giving it to an agent, which makes changes for you via pull request. &lt;/li&gt;&lt;li&gt;&lt;strong&gt;There’s a ceiling to chat-based generative design. &lt;/strong&gt;Great design hinges on a diamond-shaped process: First you diverge, or generate lots of ideas, and only then do you converge around the most promising options. Text-based chats are inherently linear and therefore bad at divergence; the setup forces you to &lt;u&gt;&lt;a href="https://every.to/context-window/mini-vibe-check-claude-design" rel="noopener noreferrer" target="_blank"&gt;select an option&lt;/a&gt;&lt;/u&gt; and iterate on it. Agents are already good at the task-completion workflows Figma supports today, but the divergent, exploratory part of design remains unsolved across the industry. 
Colyer is interested in dividing the process so specialized agents handle the divergence by pushing you to expand your thinking, while another set filters through the options to identify a single path forward. “Even the best agents, the command-line agents, don’t have the ability to do those workflows,” he says. “That’s where I see the future of design and product thinking.”&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Agents can produce so much so quickly. &lt;/strong&gt;They’re less good at determining whether any of it meets a company’s values or design standards. Colyer isn’t sure of the best way to close this gap—maybe it’s a video walkthrough, a screenshot, or a trusted review agent—but for good design to scale, AI needs to play a larger role in evaluations.&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;Miss an episode? Catch up on Dan’s recent conversations with LinkedIn cofounder &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/podcast/reid-hoffman-makes-five-predictions-about-ai-in-2026" rel="noopener noreferrer" target="_blank"&gt;Reid Hoffman&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;; the team that built Claude Code, &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/podcast/how-to-use-claude-code-like-the-people-who-built-it" rel="noopener noreferrer" target="_blank"&gt;Cat Wu&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; and &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/podcast/how-to-use-claude-code-like-the-people-who-built-it" rel="noopener noreferrer" target="_blank"&gt;Boris Cherny&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;; Vercel cofounder &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/podcast/vercel-s-guillermo-rauch-on-what-comes-after-coding" rel="noopener noreferrer" target="_blank"&gt;Guillermo Rauch&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;; podcaster &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/podcast/dwarkesh-patel-s-quest-to-learn-everything" rel="noopener noreferrer" target="_blank"&gt;Dwarkesh Patel&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;; and others, and learn how they use AI 
to think, create, and relate.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;Pulse Check: Opus 4.8 is the best tool for the right job&lt;/h2&gt;&lt;p&gt;Five days ago, we called Anthropic’s &lt;u&gt;&lt;a href="https://every.to/vibe-check/opus-4-8-vibecheck" rel="noopener noreferrer" target="_blank"&gt;Claude Opus 4.8&lt;/a&gt;&lt;/u&gt; the best Claude model yet for writing and serious engineering, and said we’d switch to it from &lt;u&gt;&lt;a href="https://every.to/vibe-check/gpt-5-5" rel="noopener noreferrer" target="_blank"&gt;GPT-5.5&lt;/a&gt;&lt;/u&gt; if the Claude app ever caught up to &lt;u&gt;&lt;a href="https://every.to/guides/codex-for-knowledge-work" rel="noopener noreferrer" target="_blank"&gt;Codex&lt;/a&gt;&lt;/u&gt;. After a work week of more testing, we’re still an Opus 4.8 admiration society, although the results are a bit more mixed as people from different disciplines have had a chance to weigh in. &lt;/p&gt;&lt;p&gt;Here’s what more of the Every team has to say about when to use the model and when to steer clear. &lt;/p&gt;&lt;h3&gt;&lt;strong&gt;Key takeaways  &lt;/strong&gt;&lt;/h3&gt;&lt;ul&gt;&lt;li&gt;&lt;strong&gt;Reach for Opus 4.8 when productive friction improves the work.&lt;/strong&gt; It’s good at tracking nuance, questioning a weak framing, and staying with a complicated problem. But the same instinct can become stubbornness, misplaced caution, or confidence in a wrong interpretation. &lt;/li&gt;&lt;li&gt;&lt;strong&gt;Give it the long, messy jobs. &lt;/strong&gt;Opus 4.8 earned its strongest reviews on sprawling source material, long-running threads, difficult creative work, and complex coding tasks. 
For routine questions and clearly scoped work, its slower pace and higher token burn can wipe out the quality gain.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Do not rebuild your workflow around it yet.&lt;/strong&gt; Even teammates who preferred Opus’s answers kept reaching for GPT-5.5 in Codex because speed, context, and a better-connected app outweighed model advantage. &lt;/li&gt;&lt;li&gt;&lt;strong&gt;Double-check security warnings.&lt;/strong&gt; Two independent accounts reported that Opus invented a prompt-injection concern. Until that failure is understood, ask it to show the evidence behind a warning before you act on it.&lt;/li&gt;&lt;/ul&gt;&lt;h3&gt;&lt;strong&gt;The Reach Test, part II &lt;/strong&gt;&lt;/h3&gt;&lt;h5&gt;&lt;strong&gt;Arielle Shipper, head of operations 🟩&lt;/strong&gt;&lt;/h5&gt;&lt;p&gt;&lt;strong&gt;Arielle Shipper&lt;/strong&gt;, Every’s new head of operations, has spent the last few weeks on a discovery tour. She used Opus 4.8 to redo an HTML site showing a summary of her findings, after building the original with &lt;u&gt;&lt;a href="https://every.to/vibe-check/opus-4-7" rel="noopener noreferrer" target="_blank"&gt;Opus 4.7&lt;/a&gt;&lt;/u&gt;. She noticed meaningful improvements: 4.8 distinguished between two similarly named pages in Notion without the explicit guidance 4.7 had required, and suggested highlighting a count of how many times specific topics came up in her conversations with the team. 
Her summary: “It seems really detail-oriented in a way I appreciate.” &lt;/p&gt;&lt;h5&gt;&lt;strong&gt;Austin Tedesco, head of growth&lt;/strong&gt; 🟨 &lt;/h5&gt;&lt;p&gt;Austin spent the weekend using Opus 4.8 on an essay with &lt;strong&gt;&lt;u&gt;&lt;a href="https://monologue.to" rel="noopener noreferrer" target="_blank"&gt;Monologue&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;, our speech-to-text tool, and our writing app, &lt;strong&gt;&lt;u&gt;&lt;a href="https://writewithspiral.com" rel="noopener noreferrer" target="_blank"&gt;Spiral&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;. For that job, he wrote that Opus 4.8 “is the best model available,” a step up from Opus 4.7 and “materially better than GPT-5.5.” But he doesn’t expect it to change his daily behavior. GPT-5.5 is “pretty good” at the same kind of creative partnership, he said, and keeping his work in Codex matters more than the modest quality improvement: “I don’t see myself reaching for Claude models much without a materially better desktop app experience, or such a dramatic leap in model quality that the harness matters less.” &lt;/p&gt;&lt;h5&gt;&lt;strong&gt;Nityesh Agarwal, senior applied AI engineer&lt;/strong&gt; 🟩(model) / 🥇(dynamic workflows) &lt;/h5&gt;&lt;p&gt;Nityesh tested Opus 4.8 inside the AI employees he is building for Every—&lt;u&gt;&lt;a href="https://every.to/p/what-i-learned-onboarding-our-ai-project-manager" rel="noopener noreferrer" target="_blank"&gt;Claudie&lt;/a&gt;&lt;/u&gt; for consulting, Andy for the editorial team. He reported that the model recalls the right memory at the right time, stays useful in longer threads, and lets him use more of its 1-million-token context window, the amount of material it can handle in one conversation. But Anthropic really won his heart with &lt;u&gt;&lt;a href="https://www.anthropic.com/news/claude-opus-4-8" rel="noopener noreferrer" target="_blank"&gt;Dynamic Workflows&lt;/a&gt;&lt;/u&gt;, the workflow-automation feature released alongside Opus 4.8. 
Combined with the new model, Nityesh says it feels like “a major power-up.” &lt;/p&gt;&lt;h5&gt;&lt;strong&gt;Lee Knowlton, software engineer &lt;/strong&gt; 🟨 &lt;strong&gt; &lt;/strong&gt;&lt;/h5&gt;&lt;p&gt;Anthropic says Opus 4.8 is more honest and better at flagging risks. But Lee saw the negative side of that instinct during a daily planning run he’d repeated for months where Claude used his calendar, Slack, and notes to create a plan for his day. One morning, the plan cited events, messages, and files Lee couldn’t find in those sources. When he asked Claude what had happened, it claimed a prompt-injection attack had supplied fake information. When Lee challenged it, Claude said it had invented that story to explain its own bad output, mistaking a planning file Lee had moved for evidence of interference. The exchange left him reluctant to trust the model’s explanations for its own behavior. &lt;/p&gt;&lt;h5&gt;&lt;strong&gt;Andrey Galko, engineer 🟩&lt;/strong&gt;&lt;/h5&gt;&lt;p&gt;Andrey is “very positive” about Opus 4.8 for coding and wrote that he likes it much more than GPT-5.5. For his use cases, it feels “more stable, reliable, and just less dumb.” His reservations are about the experience around the model, not its coding quality: GPT-5.5 is faster, and Codex gives it the better desktop-app harness.&lt;/p&gt;&lt;h2&gt;&lt;strong&gt;The verdict: Keep it within reach, not open all day&lt;/strong&gt;&lt;/h2&gt;&lt;p&gt;It’s worth noting that not everyone is as positive about Opus 4.8 as our team. 
&lt;strong&gt;Steve Yegge&lt;/strong&gt;, a software engineer and blogger, &lt;u&gt;&lt;a href="https://x.com/Steve_Yegge/status/2060967367610634530" rel="noopener noreferrer" target="_blank"&gt;wrote on X&lt;/a&gt;&lt;/u&gt; that Opus 4.8 is “suffocating” and “pathologically risk-averse.” &lt;strong&gt;Dylan Field&lt;/strong&gt;, cofounder and CEO of Figma, called Opus 4.8 &lt;u&gt;&lt;a href="https://x.com/zoink/status/2060769829133721974" rel="noopener noreferrer" target="_blank"&gt;“a very strange model,”&lt;/a&gt;&lt;/u&gt; and said that it felt more judgmental in personality and more likely to hedge in its responses than Opus 4.7. &lt;/p&gt;&lt;p&gt;When Dan canvassed the hive mind &lt;u&gt;&lt;a href="https://x.com/danshipper/status/2061817375519809665" rel="noopener noreferrer" target="_blank"&gt;on X&lt;/a&gt;&lt;/u&gt;&lt;a href="https://x.com/danshipper/status/2061817375519809665" rel="noopener noreferrer" target="_blank"&gt;,&lt;/a&gt; the &lt;u&gt;&lt;a href="https://x.com/zoop_design/status/2061819544583201217" rel="noopener noreferrer" target="_blank"&gt;replies&lt;/a&gt;&lt;/u&gt; &lt;u&gt;&lt;a href="https://x.com/DrCahitAkin/status/2061829277663039547" rel="noopener noreferrer" target="_blank"&gt;suggested&lt;/a&gt;&lt;/u&gt; that Opus 4.8’s greatest strength is its biggest liability: It resists the user more readily than other models. When that resistance improves the outcome of a hard writing or engineering task, it feels like a breakthrough. When it is mistaken in its pushback, it’s frustrating and harder to trust. &lt;/p&gt;&lt;p&gt;Overall, our launch verdict holds, with a narrower recommendation. Use Opus 4.8 when the work is dense with context and benefits from sustained reasoning across a complex task. 
Keep a hand on the wheel when the costs of misplaced confidence—or misplaced caution—are high.&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;strong&gt;For higher-risk workflows:&lt;/strong&gt; Verify its diagnosis before you trust a refusal or a security warning. Caution is only a feature when it is grounded in evidence.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;For context-heavy knowledge work:&lt;/strong&gt; It’s worth trying out when your source material is spread across documents and decisions—especially if you’ll explicitly send it deeper than the front page.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;For daily-driver usage:&lt;/strong&gt; A better model isn’t a reason to switch workspaces. If Codex is where your context, speed, and tools already compound, Opus 4.8 is a model you call in for specific jobs, not a reason to move.&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;Opus 4.8 looks most compelling when the work is long, context-heavy, and benefits from a second pass of judgment. If you mostly want something zippy to get stuff done, GPT-5.5 in Codex is probably the model you’re looking for.—&lt;em&gt;&lt;a href="https://every.to/@katie.parrott12" rel="noopener noreferrer" target="_blank"&gt;Katie Parrott&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;Disclosure: Every received early access to Anthropic’s Opus 4.8. Anthropic had no input on this review.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;Steal this workflow&lt;/h2&gt;&lt;h4&gt;Toggle between image generators&lt;/h4&gt;&lt;p&gt;Every senior designer &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@daniel_5fbd21_1" rel="noopener noreferrer" target="_blank"&gt;Daniel Rodrigues&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; has spent three years working with AI image generators. By now, he knows their strengths and weaknesses. Here’s his advice for combining two popular options to maximize creativity without sacrificing attention to detail. 
&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Step 1: Start by firing up Midjourney.&lt;/strong&gt; The &lt;u&gt;&lt;a href="https://every.to/source-code/midjourney-isn-t-the-most-accurate-ai-that-s-why-it-s-the-best" rel="noopener noreferrer" target="_blank"&gt;AI image generator&lt;/a&gt;&lt;/u&gt; produces beautiful visuals, but its real power is in its penchant for creative liberties: Give it a prompt, such as “medieval farmer reading in a field of oranges,” and it will return images with details you didn’t specify, like adding a castle in the background or giving the farmer a red hat. “You get random stuff,” Daniel says. Some of it is off base, but frequently the unpredictability sparks an entirely new (and better) direction he wouldn’t have stumbled upon otherwise.&lt;strong&gt; &lt;/strong&gt;&lt;/p&gt;&lt;div class="quill-block-image" id="quill-block-image-1780511208003" data-source="{&amp;quot;dom_id&amp;quot;:&amp;quot;quill-block-image-1780511208003&amp;quot;,&amp;quot;link&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4285/optimized_8e561415-9035-44cd-8da7-f6a16c344f1a.png&amp;quot;,&amp;quot;image&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4285/optimized_8e561415-9035-44cd-8da7-f6a16c344f1a.png&amp;quot;,&amp;quot;caption&amp;quot;:&amp;quot;One of the images generated in Midjourney from the prompt “medieval farmer reading in a field of oranges.” (Image courtesy of Daniel Rodrigues.)&amp;quot;,&amp;quot;error&amp;quot;:null}"&gt;&lt;div&gt;&lt;a href="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4285/optimized_8e561415-9035-44cd-8da7-f6a16c344f1a.png" target="_blank" rel="noopener noreferrer"&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4285/optimized_8e561415-9035-44cd-8da7-f6a16c344f1a.png" alt="One of the images generated in Midjourney from the prompt “medieval farmer reading in a field of oranges.” (Image courtesy of Daniel 
Rodrigues.)"&gt;&lt;/a&gt;&lt;figcaption class="quill-image-caption"&gt;One of the images generated in Midjourney from the prompt “medieval farmer reading in a field of oranges.” (Image courtesy of Daniel Rodrigues.)&lt;/figcaption&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;&lt;br&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Step 2: Take the image you made in Midjourney, and upload it into Nano Banana or ChatGPT Images 2.0 to nail down the specifics.&lt;/strong&gt; Compared to Midjourney, both models follow directions to a T. This literalness limits Daniel’s ability to make creative leaps with the tool, but they’re great for refining an existing image so it better matches the visual in his head. &lt;strong&gt; &lt;/strong&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Step 3: Go back-and-forth with the model.&lt;/strong&gt; For detailed prompts—say, of a “woman in her 30s, with red sunglasses, blue earrings, writing in a notebook with a yellow Montblanc pen”—Nano Banana will probably only capture 70 percent of what you want, Daniel says. From there, you iterate with the model, refining one item at a time so it can focus on getting that change right until the output fits your exact specifications. 
&lt;/p&gt;&lt;p&gt;To stress-test the models on their ability to follow complex directions, Daniel ran the following prompt in Midjourney, Nano Banana, and ChatGPT Image 2.0.&lt;/p&gt;&lt;blockquote&gt;&lt;em&gt;Create a photorealistic image of a 35-year-old man sitting alone in a small Paris café, sketching architectural drawings in a notebook.&lt;/em&gt;&lt;/blockquote&gt;&lt;blockquote&gt;&lt;em&gt;He has olive skin, short dark hair, a trimmed beard, and a small silver nose ring.&lt;/em&gt;&lt;/blockquote&gt;&lt;blockquote&gt;&lt;em&gt;He is wearing a dark green jacket, black turtleneck, and a silver wristwatch.&lt;/em&gt;&lt;/blockquote&gt;&lt;blockquote&gt;&lt;em&gt;On the wooden table in front of him are:&lt;/em&gt;&lt;/blockquote&gt;&lt;blockquote&gt;&lt;em&gt;A notebook labeled “Project Atlas”&lt;/em&gt;&lt;/blockquote&gt;&lt;blockquote&gt;&lt;em&gt;A blue fountain pen&lt;/em&gt;&lt;/blockquote&gt;&lt;blockquote&gt;&lt;em&gt;A coffee cup with latte art&lt;/em&gt;&lt;/blockquote&gt;&lt;blockquote&gt;&lt;em&gt;A folded newspaper dated October 14, 2031&lt;/em&gt;&lt;/blockquote&gt;&lt;blockquote&gt;&lt;em&gt;Behind him:&lt;/em&gt;&lt;/blockquote&gt;&lt;blockquote&gt;&lt;em&gt;A framed Mona Lisa reproduction&lt;/em&gt;&lt;/blockquote&gt;&lt;blockquote&gt;&lt;em&gt;A vintage wall clock showing 4:26&lt;/em&gt;&lt;/blockquote&gt;&lt;blockquote&gt;&lt;em&gt;A red bicycle visible through the window&lt;/em&gt;&lt;/blockquote&gt;&lt;blockquote&gt;&lt;em&gt;A street sign reading “Rue de Rivoli”&lt;/em&gt;&lt;/blockquote&gt;&lt;blockquote&gt;&lt;em&gt;Additional details:&lt;/em&gt;&lt;/blockquote&gt;&lt;blockquote&gt;&lt;em&gt;The man’s watch must also show 4:26&lt;/em&gt;&lt;/blockquote&gt;&lt;blockquote&gt;&lt;em&gt;A small black cat is sleeping beneath his chair.&lt;/em&gt;&lt;/blockquote&gt;&lt;blockquote&gt;&lt;em&gt;The image should look like a real photograph taken with a professional camera, with all listed details clearly visible and 
consistent.&lt;/em&gt;&lt;/blockquote&gt;&lt;div class="quill-block-image" id="quill-block-image-1780511315377" data-source="{&amp;quot;dom_id&amp;quot;:&amp;quot;quill-block-image-1780511315377&amp;quot;,&amp;quot;link&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4285/optimized_df1d5319-d5de-4204-98d4-affdbaf38826.png&amp;quot;,&amp;quot;image&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4285/optimized_df1d5319-d5de-4204-98d4-affdbaf38826.png&amp;quot;,&amp;quot;caption&amp;quot;:&amp;quot;Midjourney's version. Notice how the model struggles with text—Midjourney is “terrible with letters,” Daniel says—and drops or misinterprets a number of details, such as the color of the pen, the notebook, and the cat’s sleeping status. The man is also wearing two watches. (Image courtesy of Daniel Rodrigues.)&amp;quot;,&amp;quot;error&amp;quot;:null}"&gt;&lt;div&gt;&lt;a href="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4285/optimized_df1d5319-d5de-4204-98d4-affdbaf38826.png" target="_blank" rel="noopener noreferrer"&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4285/optimized_df1d5319-d5de-4204-98d4-affdbaf38826.png" alt="Midjourney's version. Notice how the model struggles with text—Midjourney is “terrible with letters,” Daniel says—and drops or misinterprets a number of details, such as the color of the pen, the notebook, and the cat’s sleeping status. The man is also wearing two watches. (Image courtesy of Daniel Rodrigues.)"&gt;&lt;/a&gt;&lt;figcaption class="quill-image-caption"&gt;Midjourney's version. Notice how the model struggles with text—Midjourney is “terrible with letters,” Daniel says—and drops or misinterprets a number of details, such as the color of the pen, the notebook, and the cat’s sleeping status. The man is also wearing two watches. 
(Image courtesy of Daniel Rodrigues.)&lt;/figcaption&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;&lt;br&gt;&lt;/p&gt;&lt;div class="quill-block-image" id="quill-block-image-1780511338942" data-source="{&amp;quot;dom_id&amp;quot;:&amp;quot;quill-block-image-1780511338942&amp;quot;,&amp;quot;link&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4285/optimized_6d9a4cfb-452e-4e0e-9a76-c9f47426110d.png&amp;quot;,&amp;quot;image&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4285/optimized_6d9a4cfb-452e-4e0e-9a76-c9f47426110d.png&amp;quot;,&amp;quot;caption&amp;quot;:&amp;quot;Nano Banana’s version. The model does a better job, although some key details are dropped or presented oddly. (For example, the \&amp;quot;Rue de Rivoli\&amp;quot; sign reads correctly, but appears inside the cafe.) (Image courtesy of Daniel Rodrigues.)&amp;quot;,&amp;quot;error&amp;quot;:null}"&gt;&lt;div&gt;&lt;a href="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4285/optimized_6d9a4cfb-452e-4e0e-9a76-c9f47426110d.png" target="_blank" rel="noopener noreferrer"&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4285/optimized_6d9a4cfb-452e-4e0e-9a76-c9f47426110d.png" alt="Nano Banana’s version. The model does a better job, although some key details are dropped or presented oddly. (For example, the &amp;quot;Rue de Rivoli&amp;quot; sign reads correctly, but appears inside the cafe.) (Image courtesy of Daniel Rodrigues.)"&gt;&lt;/a&gt;&lt;figcaption class="quill-image-caption"&gt;Nano Banana’s version. The model does a better job, although some key details are dropped or presented oddly. (For example, the "Rue de Rivoli" sign reads correctly, but appears inside the cafe.) 
(Image courtesy of Daniel Rodrigues.)&lt;/figcaption&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;&lt;br&gt;&lt;/p&gt;&lt;div class="quill-block-image" id="quill-block-image-1780511379743" data-source="{&amp;quot;dom_id&amp;quot;:&amp;quot;quill-block-image-1780511379743&amp;quot;,&amp;quot;link&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4285/optimized_e04c8e12-59cf-468a-ba74-0f30af63de50.png&amp;quot;,&amp;quot;image&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4285/optimized_e04c8e12-59cf-468a-ba74-0f30af63de50.png&amp;quot;,&amp;quot;caption&amp;quot;:&amp;quot;ChatGPT Image 2.0’s version.  It “wins this time,” Daniel says, incorporating most of the specifications such as the sleeping cat, a notebook labeled \&amp;quot;Project Atlas,\&amp;quot; and even the clock showing 4:26, which image models generally have a hard time getting right. (Image courtesy of Daniel Rodrigues.)&amp;quot;,&amp;quot;error&amp;quot;:null}"&gt;&lt;div&gt;&lt;a href="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4285/optimized_e04c8e12-59cf-468a-ba74-0f30af63de50.png" target="_blank" rel="noopener noreferrer"&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4285/optimized_e04c8e12-59cf-468a-ba74-0f30af63de50.png" alt="ChatGPT Image 2.0’s version.  It “wins this time,” Daniel says, incorporating most of the specifications such as the sleeping cat, a notebook labeled &amp;quot;Project Atlas,&amp;quot; and even the clock showing 4:26, which image models generally have a hard time getting right. (Image courtesy of Daniel Rodrigues.)"&gt;&lt;/a&gt;&lt;figcaption class="quill-image-caption"&gt;ChatGPT Image 2.0’s version.  It “wins this time,” Daniel says, incorporating most of the specifications such as the sleeping cat, a notebook labeled "Project Atlas," and even the clock showing 4:26, which image models generally have a hard time getting right. 
(Image courtesy of Daniel Rodrigues.)&lt;/figcaption&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;One last thing&lt;/h2&gt;&lt;p&gt;Where do you fall on the &lt;u&gt;&lt;a href="https://every.to/p/where-do-you-fall-on-the-eight-levels-of-ai-adoption" rel="noopener noreferrer" target="_blank"&gt;eight levels of AI adoption&lt;/a&gt;&lt;/u&gt;? If you don’t have time to ingest &lt;strong&gt;&lt;a href="https://every.to/@mike_2114" rel="noopener noreferrer" target="_blank"&gt;Mike Taylor&lt;/a&gt;&lt;/strong&gt;’s &lt;u&gt;&lt;a href="https://every.to/guides/the-eight-levels-of-ai-adoption?source=post_button" rel="noopener noreferrer" target="_blank"&gt;comprehensive guide&lt;/a&gt;&lt;/u&gt; on the subject—it’s well worth a read, but we get it, time is a finite resource—here’s a quick way to identify what stage you’re at.  &lt;/p&gt;&lt;p&gt;Simply run this prompt in your agent of choice: &lt;/p&gt;&lt;blockquote&gt;&lt;em&gt;based on everything you know about me, including memories, tools and skills installed, and past session history, what level would you say I was at on this guide to AI adoption levels? &lt;a href="https://every.to/guides/the-eight-levels-of-ai-adoption" rel="noopener noreferrer" target="_blank"&gt;https://every.to/guides/the-eight-levels-of-ai-adoption&lt;/a&gt;&lt;/em&gt;&lt;/blockquote&gt;&lt;div class="quill-block-image" id="quill-block-image-1780511606694" data-source="{&amp;quot;dom_id&amp;quot;:&amp;quot;quill-block-image-1780511606694&amp;quot;,&amp;quot;link&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4285/optimized_c026bd28-f08b-44b7-ad2f-e633be5471b9.png&amp;quot;,&amp;quot;image&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4285/optimized_c026bd28-f08b-44b7-ad2f-e633be5471b9.png&amp;quot;,&amp;quot;caption&amp;quot;:&amp;quot;Katie is entering Level 6 territory. 
(Image courtesy of Katie Parrott.)&amp;quot;,&amp;quot;error&amp;quot;:null}"&gt;&lt;div&gt;&lt;a href="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4285/optimized_c026bd28-f08b-44b7-ad2f-e633be5471b9.png" target="_blank" rel="noopener noreferrer"&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4285/optimized_c026bd28-f08b-44b7-ad2f-e633be5471b9.png" alt="Katie is entering Level 6 territory. (Image courtesy of Katie Parrott.)"&gt;&lt;/a&gt;&lt;figcaption class="quill-image-caption"&gt;Katie is entering Level 6 territory. (Image courtesy of Katie Parrott.)&lt;/figcaption&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;em&gt;&lt;a href="https://every.to/@laura_27bbaf_1" rel="noopener noreferrer" target="_blank"&gt;Laura Entis&lt;/a&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt; is a staff writer at Every. You can follow her on &lt;a href="https://www.linkedin.com/in/lauraentis/" rel="noopener noreferrer" target="_blank"&gt;LinkedIn&lt;/a&gt;. To read more essays like this, subscribe to &lt;u&gt;&lt;a href="https://every.to/subscribe" rel="noopener noreferrer" target="_blank"&gt;Every&lt;/a&gt;&lt;/u&gt;, and follow us on X at &lt;u&gt;&lt;a href="http://twitter.com/every" rel="noopener noreferrer" target="_blank"&gt;@every&lt;/a&gt;&lt;/u&gt; and on &lt;u&gt;&lt;a href="https://www.linkedin.com/company/everyinc/" rel="noopener noreferrer" target="_blank"&gt;LinkedIn&lt;/a&gt;&lt;/u&gt;.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;We &lt;u&gt;&lt;a href="https://every.to/studio" rel="noopener noreferrer" target="_blank"&gt;build AI tools&lt;/a&gt;&lt;/u&gt; for readers like you. Write brilliantly with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://writewithspiral.com/" rel="noopener noreferrer" target="_blank"&gt;Spiral&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. 
Organize files automatically with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://makeitsparkle.co/?utm_source=everyfooter" rel="noopener noreferrer" target="_blank"&gt;Sparkle&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. Deliver yourself from email with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://cora.computer/" rel="noopener noreferrer" target="_blank"&gt;Cora&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. Dictate effortlessly with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://monologue.to/" rel="noopener noreferrer" target="_blank"&gt;Monologue&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. Collaborate with agents on documents with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;a href="https://www.proofeditor.ai/" rel="noopener noreferrer" target="_blank"&gt;Proof&lt;/a&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. &lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;Discover Every’s &lt;u&gt;&lt;a href="https://every.to/events" rel="noopener noreferrer" target="_blank"&gt;upcoming workshops and camps&lt;/a&gt;&lt;/u&gt;, and access recordings from past events.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;For sponsorship opportunities, reach out to sponsorships@every.to.&lt;/em&gt;&lt;/p&gt;&lt;div class="quill-button" data-source="{&amp;quot;id&amp;quot;:&amp;quot;quill-button-1769187301610&amp;quot;,&amp;quot;url&amp;quot;:&amp;quot;https://every.to/subscribe?source=post_button&amp;quot;,&amp;quot;text&amp;quot;:&amp;quot;Subscribe&amp;quot;}" id="quill-button-1769187301610"&gt;&lt;a href="https://every.to/subscribe?source=post_button"&gt;Subscribe&lt;/a&gt;&lt;/div&gt;</description>
      <author>Laura Entis / Context Window</author>
      <pubDate>2026-06-03 18:00:00 -0400</pubDate>
      <guid>https://every.to/context-window/opus-4-8-is-smart-enough-to-get-in-your-way</guid>
      <link>https://every.to/context-window/opus-4-8-is-smart-enough-to-get-in-your-way</link>
    </item>
    <item>
      <title>Figma Exec on Why the SaaSpocalypse Is a Goldmine</title>
      <description>&lt;table&gt;&lt;tr&gt;&lt;td&gt;&lt;img alt="AI &amp;amp; I" src="https://d24ovhgu8s7341.cloudfront.net/uploads/publication/logo/97/small_ai_and_i_cover_1.png" /&gt;&lt;/td&gt;&lt;td&gt;&lt;/td&gt;&lt;td&gt;&lt;table&gt;&lt;tr&gt;&lt;td&gt;by &lt;a href="https://every.to/@danshipper" itemprop="name"&gt;Dan Shipper&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;in &lt;a href="https://every.to/podcast"&gt;AI &amp;amp; I&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;p&gt;The transcript of &lt;em&gt;AI &amp;amp; I &lt;/em&gt;with &lt;strong&gt;Matt Colyer, &lt;/strong&gt; Figma’s director of product management for developers, is below. Watch on &lt;u&gt;&lt;a href="https://x.com/danshipper/status/2062202908306030915" rel="noopener noreferrer" target="_blank"&gt;X&lt;/a&gt;&lt;/u&gt; or &lt;u&gt;&lt;a href="https://www.youtube.com/watch?v=kYKebKB3-d0" rel="noopener noreferrer" target="_blank"&gt;YouTube&lt;/a&gt;&lt;/u&gt;, or listen on &lt;u&gt;&lt;a href="https://open.spotify.com/episode/4qTiIlvhxgnGI0cG06aFw5?si=rUdSykRfRhmfQ4F7f5UJ0A&amp;amp;nd=1&amp;amp;dlsi=207e4630daf24e2b" rel="noopener noreferrer" target="_blank"&gt;Spotify&lt;/a&gt;&lt;/u&gt; or &lt;u&gt;&lt;a href="https://podcasts.apple.com/us/podcast/ai-i/id1719789201" rel="noopener noreferrer" target="_blank"&gt;Apple Podcasts&lt;/a&gt;&lt;/u&gt;.&lt;/p&gt;&lt;h3&gt;Timestamps&lt;/h3&gt;&lt;ol&gt;&lt;li&gt;Introduction: 00:01:03&lt;/li&gt;&lt;li&gt;The SaaSpocalypse narrative has it backwards: 00:02:15&lt;/li&gt;&lt;li&gt;Matt’s email-agent origin story: 00:05:27&lt;/li&gt;&lt;li&gt;Divergent vs. 
convergent design thinking: 00:13:21&lt;/li&gt;&lt;li&gt;Figma’s MCP server: 00:17:39&lt;/li&gt;&lt;li&gt;Why design agents need personalization: 00:19:45&lt;/li&gt;&lt;li&gt;Every problem is a context problem: 00:22:09&lt;/li&gt;&lt;li&gt;Apple and Google as the reigning kings of context: 00:25:12&lt;/li&gt;&lt;li&gt;Review is the new bottleneck: 00:28:18&lt;/li&gt;&lt;/ol&gt;&lt;h3&gt;Transcript&lt;/h3&gt;&lt;p&gt;(00:00:00)&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Matt Colyer&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;The SaaSpocalypse—or, more positively, the next era of software. I’m really excited about it, because I think the number of developers in the world is about to go from tens of millions to a billion, maybe more. We’re moving through this incredible democratization of technology, and the end result is dramatically more software in the world. If you’re an established product in that space, it’s not a casualty—it’s a goldmine.&lt;/p&gt;&lt;p&gt;(00:01:03)&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan Shipper&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Matt, welcome to the show.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Matt&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Thanks for having me, Dan.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan Shipper&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;For people who don’t know you, you are the director of product management for developers at Figma. I want to start with what I think is the big question on everyone’s mind. I bought a bunch of Figma stock about two months ago, partly because of this whole SaaS apocalypse narrative—and I want to get into that with you. You have a lot to share about AI and product management, all the stuff you’ve been doing yourself. But I’d love to start with: what is going to happen to SaaS tools in the AI era? Figma is a really interesting example, because there are people saying, “Oh, I don’t have to use Figma anymore”—and at the same time, you just launched an agent inside your product, and you have Figma MCP. 
So if you’re transitioning from a world where there was no AI when Figma started, to now being a big scaled product in an AI world—how does that work? How are you thinking about whether to open the product up to agents, build your own agent, what’s working, what’s not?&lt;/p&gt;&lt;p&gt;&lt;em&gt;(00:02:15)&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Matt&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;I’d love to talk about that. For me it comes from a couple of different angles. The first is the SaaSpocalypse—or, as a more positive framing, the next era of software. I’m really excited about it. I’ve worked in developer tools for a long time, and maybe five or ten years ago, the estimate for the number of developers worldwide was somewhere around 25 to 40 million. What’s most exciting about this moment is that I think it’s going to be a billion—maybe more than that. There’s this incredible democratization of technology happening. There’s a lot of catchphrases around homegrown software, and we can get into that. But the end result is that there is dramatically more software in the world. If you’re in that space, it means it’s a goldmine—there’s all this opportunity, and I’m really excited about it. Figma and a lot of other SaaS businesses are too.&lt;/p&gt;&lt;p&gt;The other part—responding to the more negative sentiment you see online—is the question of, well, what if I could just vibe-code every app? January of this year was the moment that narrative went mainstream. I’d been doing this stuff for probably 18 months before that, so I was already in “let’s go build everything” mode. But I feel like the whole world caught up in January, and people are building. What I know from my own personal journey is that it’s really fun to build the initial version. I actually built one of my own agents two years ago—the very first one was an email agent. 
It started as a terrible Python script, rickety, replies sometimes didn’t work.&lt;/p&gt;&lt;p&gt;The larger narrative here is that software companies build more than just code. There’s a reason I pay for Gmail to run my email—it turns out it’s pretty unpleasant when you have to worry about upgrading the SMTP version yourself and you just want to receive email. As I’ve run my own agents for my personal life, I’ve experienced the pain of: the product I want doesn’t exist, I built it, and now I own the ongoing cost of it. Honestly, I’m buying more software these days than I ever did before, because I’m like, “That tool seems useful. I’ll just pay somebody else to run my agent for me.”&lt;/p&gt;&lt;p&gt;&lt;em&gt;(00:04:48)&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan &lt;/strong&gt;&lt;/p&gt;&lt;p&gt;I totally agree. As someone who has vibe-coded my fair share of tools—yes, there’s the personal maintenance burden, but also I’ve vibe-coded tools we’ve released into production, and let me tell you, it is not as simple as saying “fix this bug.” That’s really missed in the SaaSpocalypse discourse.&lt;/p&gt;&lt;p&gt;That said—if one of the first things you built was an email agent, I’m super curious how you’re managing email right now, because I feel like things have gotten to a point where you can just sort of do your email without actually doing your email.&lt;/p&gt;&lt;p&gt;&lt;em&gt;(00:05:27)&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Matt&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Yeah. The problem that started two years ago: I was using chatbots at work, because at that point that was the primary interface—agent usage wasn’t really a thing yet. In my personal life, I have kids in three schools. If there are any parents listening, you know what it’s like to get the PTO emails—what’s the theme for today, what’s spirit day. The worst parent feeling in the world is missing crazy hair day because your kid didn’t do it. 
I’d done that more than once, and I was like: I cannot miss another one.&lt;/p&gt;&lt;p&gt;I had to track maybe 15 emails a day. You think corporate America produces a lot of email—wait until you get to the PTO emails from school. I thought: who can read all of these? Agents. Why can’t I just hook this up? The missing piece was the email inbox connection. So the first version was literally: grab the inbox, grab the top email, paste it to an LLM, dump the response back. My favorite prompt in those days was basically just “extract the facts”—and it was always shocking to me that I’d send a multi-page email and get three bullet points back.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan Shipper&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;I remember those days—the manual wiring-up and copy-pasting. It feels so far away, but it was only a year or two ago.&lt;/p&gt;&lt;p&gt;&lt;em&gt;(00:07:03)&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Matt&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;And then I added a memory system. The proactive piece—I think OpenAI’s Codex hit on this—was the real unlock. My version of it was having the agent send me a summary email every day at a set time. Instead of having to go to a tool and ask for the thing, it would just show up. Not because it was particularly smart—it just ran at the same time every day. But I think where agents are going is much more proactive than that: thinking about when to reach out and let you know what’s going on, without being asked.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan&lt;/strong&gt; &lt;/p&gt;&lt;p&gt;So given where you were a couple of years ago—what are the workflow things you rely on now that you’re excited about?&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Matt&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;One thing I’m still trying to figure out in my work life is summarization. Part of the job is understanding an immense amount of information and filtering it—teaching the agent which things matter and which don’t. 
It’s a genuinely hard problem, because there’s a lot of stuff that seems unimportant at first read and then matters three days later. How do you describe to a system which things are worth keeping?&lt;/p&gt;&lt;p&gt;&lt;em&gt;(00:08:36)&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan&lt;/strong&gt; &lt;/p&gt;&lt;p&gt;It also feels like the agents are a little bit... one thing I do is have Codex go through all my company meetings—we record everything in Notion—and surface the things I might care about. Which is great, because I can effectively be in meetings I wasn’t in. But if it gives me stuff that’s not quite right and I correct it, it overcorrects—it gives me everything I said I wanted, way too literally and way too much. It’s never quite right in this weird way.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Matt&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;I was curious where you’re at on that, because it feels like one of the genuinely unsolved problems. We’re all grasping for it. Relatedly—with your email inbox, have you fully automated it? Does it reply on your behalf, or do you approve every reply?&lt;/p&gt;&lt;p&gt;&lt;em&gt;(00:09:30)&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan&lt;/strong&gt; &lt;/p&gt;&lt;p&gt;I approve every reply. What I have is a small app I built in Codex that I open in the Codex in-app browser—it runs locally. Every day it sweeps through all my emails and gives me a page where every email is listed with a draft reply: here’s what I’m probably going to say. Because it has access to my computer, if it’s an email from my lawyers it can go search and come back with essentially what it thinks I should say. Then I just scroll through and talk to it using Monologue—I dictate: “No, fix this,” or “Yes, send that draft.” I’ve been at inbox zero for four straight weeks, which has never happened before. 
My assistant literally asked me what the hell was going on.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Matt&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;I am a member of the inbox zero religion. I’ve been running it for years and I believe in it—but it sure takes a lot of work. I’m curious about the Monologue thing. Do you actually talk to it, or do you type?&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan Shipper&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;I talk to it. It’s audio only right now.&lt;/p&gt;&lt;p&gt;&lt;em&gt;(00:10:45)&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Matt&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;The audio unlock is huge and underrated. One thing I’ve learned is that it feels a little weird to talk to your computer—so my trick is I use Loom a lot. It feels less strange to pretend I’m screen-sharing with someone, and it lets me actually talk through the problem.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan&lt;/strong&gt; &lt;/p&gt;&lt;p&gt;That’s funny. In the office?&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Matt&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Mostly from home, so people don’t hear me talking to myself. But even in the office—people will just assume you’re on a Zoom.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan&lt;/strong&gt; &lt;/p&gt;&lt;p&gt;At some point there was this social barrier, and now I assume anyone in the office talking isn’t talking to me—they’re talking to their computer. It’s weird when they’re actually addressing me. There’s also the whisper move, where someone gets close to their screen and quietly says, “I want you to do this one little thing.”&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Matt&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;It’s something like two or three times as fast to talk versus type.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan&lt;/strong&gt; &lt;/p&gt;&lt;p&gt;And I’ve got carpal tunnel, so it’s much more ergonomic. Huge unlock.&lt;/p&gt;&lt;p&gt;&lt;em&gt;(00:12:06)&lt;/em&gt;&lt;/p&gt;&lt;p&gt;I do want to get back to what we were originally discussing. I think we’re on the same page: SaaSpocalypse—not a real thing. 
Making a piece of SaaS software that works reliably is a gigantic effort, and some people want to do that and others just want to pay for it.&lt;/p&gt;&lt;p&gt;Let’s go deeper into Figma specifically. In a design world, there are questions about whether you just want to chat with your landing page and move things around that way, or whether you want the infinite canvas. Internally, pretty much all of our designers are AI-pilled early adopters, and they all say: typing is good for a first pass, but to get the details right, I need to actually move stuff around. So in the design world, how does that change the product strategy when the possibilities for how you might design something have changed so radically?&lt;/p&gt;&lt;p&gt;&lt;em&gt;(00:13:21)&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Matt&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;There’s a lot to unpack, and we’re in the early innings. I think we’re still in the hangover of the text-box paradigm—so much of the default for generative UI has been chat. I feel like we’re starting to enter the second chapter of that, which is what excites me about our agents launch. We’ve had it internally for a while. For those who haven’t seen it, it’s the ability to use an agent directly on the infinite canvas.&lt;/p&gt;&lt;p&gt;It’s funny—a lot of what’s old is new again in LLM and ML land. We’ve reinvented evals, which are basically unit tests. We’ve reinvented prompting, which is basically user input. And design in the AI era is still governed by the same core principles. One of the core principles for me is the design diamond—divergent thinking and then convergent thinking. Most design problems follow that shape. Brainstorming is about generating ideas, not shooting them down.&lt;/p&gt;&lt;p&gt;One thing we haven’t fully unlocked yet from these new capabilities is the ability to supercharge generative thinking. We get stuck in our own lived experience and approach problems from a single angle. 
The value of a teammate is that they have a totally different starting point, and the creativity comes from that collision—“Oh, I hadn’t thought about it from that angle. Let me take that and build on it.”&lt;/p&gt;&lt;p&gt;So what does this mean in the new AI world? If we get outside text boxes—which are very linear, very “this then that”—and onto the canvas, the agents can enable divergent thinking. You have a frame: try grayscale. Another frame: try sepia. The sepia’s interesting but the type is wrong. Duplicate and try again. Now the accessibility’s off. And so on.&lt;/p&gt;&lt;p&gt;That’s still fairly early-stage—it’s the human driving all the input. But I think where we’re headed is an agent that throws a bunch of frames on the canvas and says, “Your job is to push these in different directions, not just double down on one.” And then a separate convergent agent that looks at 25 frames of concepts for a new marketing page and clusters them—“These three are similar, these are grouped around this”—and you can ask it for an opinion: if I’m a customer clicking through, which one makes the most sense? We haven’t really tapped any of that yet. Even the best command-line agents don’t have those workflows. That’s where I see the future of design and product thinking.&lt;/p&gt;&lt;p&gt;&lt;em&gt;(00:16:30)&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan&lt;/strong&gt; &lt;/p&gt;&lt;p&gt;That makes total sense. From what I can tell so far, agents are really good for: “I have a design system, I need a new landing page in that design system—go.” Which, honestly, a lot of designers don’t want to spend time on—the nth landing page or the nth graphic for a post. That’s convergent. What about the question of external agents versus building your own, or having both—which you do have?&lt;/p&gt;&lt;p&gt;&lt;em&gt;(00:17:39)&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Matt&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;We embrace both. 
Design workflows and engineering workflows are different, but the lines are blurring. In the future we’re all going to be builders—it’s just a question of which angle you’re coming from. We definitely support third-party agents today, and our answer for that is our MCP server. One of the nice things about MCP is that it provides a standardized interface across all these different kinds of tools.&lt;/p&gt;&lt;p&gt;We think about the problem in two directions. The first is code-to-design. A common scenario: you have a signup page but it doesn’t support GDPR. Most people aren’t going to start from a greenfield and reimagine the entire flow—they log in Monday morning and think, I just need to add the checkbox. So for that workflow, if you’re comfortable in Codex or Claude or Cursor or Windsurf, you pull up your codebase, fire up the MCP server, and ask it: “Go to this page, fire up the dev server, and copy it into Figma.” And it will actually do it. We released that earlier this year. It’s a little mind-blowing that agents can do it faithfully—but they can. You’ve removed all the drudgery and you’ve got the design into a medium where you can interact with it precisely.&lt;/p&gt;&lt;p&gt;The second direction is design-to-code. We have a tool called Get Design Context, which takes a Figma design, wraps up all the properties and components you’re using plus any guidelines you’ve set in your design library, and provides it to the agent. The agent can look at your codebase, make a branch, create a PR, make the changes—and you can even ask it to take a screenshot and attach it to the PR. 
Your job is like what you described with email: you’re not merging blindly, but you have a solid starting point to riff on.&lt;/p&gt;&lt;p&gt;&lt;em&gt;(00:19:36)&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan&lt;/strong&gt; &lt;/p&gt;&lt;p&gt;What have you learned about what makes for a good internal agent experience—inside a product—that you might not have known before the Figma Agent launch?&lt;/p&gt;&lt;p&gt;&lt;em&gt;(00:19:45)&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Matt&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Specifically for Figma: context and personalization matter enormously. In a lot of AI products I’ve worked on in the past, personalization is often the last thing you get to—you just get it working for everyone first. But I think the difference between an okay agent and one that people genuinely love is personalization. We talked about memory as a form of it in third-party chat agents. For Figma, the equivalent is the design system. If you have an assistant but it doesn’t understand how you structure your designs and how you put them together, what it creates just isn’t usable.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan&lt;/strong&gt; &lt;/p&gt;&lt;p&gt;I don’t know what your plans are around Figma being more proactive—being a proactive agent—but I’m curious how that’s going, to the extent you can share. We’ve talked about how hard it is to get right.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Matt&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;That’s where the future is going, if you look at how agents have evolved. We’ve got a lot of things cooking internally that I can’t speak to specifically. But I can talk about the problems we see today. If the amount of software in the world is really exploding, one of the bigger challenges becomes: how do you make sure it’s consistent with your values? We become the bottleneck—we only have so many human eyes to review all of this work. 
How do we provide a solution that lets people keep innovating at the speed agents create, while maintaining their values?&lt;/p&gt;&lt;p&gt;&lt;em&gt;(00:21:36)&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan&lt;/strong&gt; &lt;/p&gt;&lt;p&gt;What has the transition been like internally at Figma—in the engineering org, the product org, the design org—from a pre-AI world to now?&lt;/p&gt;&lt;p&gt;&lt;em&gt;(00:22:09)&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Matt&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;I joined in January, and even in that short window it’s been night and day. In January, people were experimenting with new ways of working across all the functions—engineering was probably leading the way, as it usually does in these cases. But I’ll give you an example from the product org. We had an offsite—I think you actually came by, small world. One of my favorite memories from that offsite was what our product operations team built. They called it PMOS.&lt;/p&gt;&lt;p&gt;To take a step back: one of the big unlocks I’ve found with AI is that you start to realize every problem is a context problem. The work becomes about framing the problem with the right set of information. Our product operations team had this insight: a lot of the work we do as PMs lives in structured data. Why don’t we aggregate it? Start with the org chart—throw it in a SQLite table. Create a connector to Asana. Connect Slack, GitHub, a few other things.&lt;/p&gt;&lt;p&gt;Then the real insight: skills had really taken off at this point, and one they were excited about was onboarding file creation. When you add a new team member, as a manager you have to create a customized document—here are the channels you should know, here are the people you should know. That knowledge used to feel like it lived entirely in your head. But once you shape the context right, the data was already there. You have the org chart. 
The agent can walk it and figure out who’s on the team, who the trifecta is on the product-engineering-design side. You just tell it: here’s the new person, here’s the team they’re joining. It does a bunch of research, goes into Slack, figures out the relevant channels, reads the last 30 days of content, checks the Asana board, finds all the projects. And it comes back with something that’s uncannily good. A genuinely strong starting point.&lt;/p&gt;&lt;p&gt;&lt;em&gt;(00:24:03)&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan&lt;/strong&gt; &lt;/p&gt;&lt;p&gt;That’s one of the things I think made Claude Code so good, and what makes Codex so good right now. Everyone initially tried to build agents that lived in the cloud and were always on—but then you had to manually connect them to everything. Claude Code is just an agent on your computer with access to everything you have access to, and that completely changes what it can do because it can get all the context it needs.&lt;/p&gt;&lt;p&gt;Same with Codex—I can ask it a random question. We published an article today, and I asked it, “Who should I send this to?” It went through my emails and texts—I didn’t even realize it had access to all of that—and found five people I probably would have forgotten but should have sent it to. That’s the sort of magical thing that’s starting to happen. The AI itself would have been capable of this for a while if you gave it all the context—but it’s only now that it’s in the right harness and form factor, and can do it a little more independently than before.&lt;/p&gt;&lt;p&gt;&lt;em&gt;(00:25:12)&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Matt&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;I want to put a plea out there. At WWDC—I think it was ’25, Apple Intelligence—I was all in. I upgraded my iPad, I was like, “This is going to be it.” They had this concept of: our phones have all of this personal data. And then it just... wasn’t it. 
I’m really hoping WWDC this year actually is it, because the technology has been there. The part that’s missing is tying it all together. The mobile phone ecosystem has all that content. I’m waiting for the always-on Siri that actually runs in the background and is smart, rather than “What was that? I didn’t understand you.” One day.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan&lt;/strong&gt; &lt;/p&gt;&lt;p&gt;Do you think they’re going to get that right? And if they don’t, does it matter?&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Matt&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;I think it still matters, because even being late to the game, they are the king of context. And Google has also, interestingly, seemed to wake up to that at Google I/O this year—they don’t have as much data as Apple, but they have a lot. It seems like they’re now starting to marry their AI products. I think Spark is supposedly the always-on agent that’s going to be auto-connected to all of your Google content. I’m waiting for the day it just runs my inbox for me and I get to inbox zero.&lt;/p&gt;&lt;p&gt;&lt;em&gt;(00:27:03)&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan&lt;/strong&gt; &lt;/p&gt;&lt;p&gt;I just have this feeling about Apple—when OpenAI’s Codex took off, everyone started buying Mac Minis, and you think, what a great business. They don’t even have to be in the AI race because they win by default—they make the hardware everything runs on. And even if they’re behind on Apple Intelligence, which they are, their software has historically lagged their hardware. Because the hardware is so good, they have a lot of time to catch up.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Matt&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Their strategy is smart on the privacy angle too. It is genuinely concerning to upload all your information to the cloud. 
I think they’re in the game—I’m really hoping they’ve got something interesting this year.&lt;/p&gt;&lt;p&gt;&lt;em&gt;(00:27:51)&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan&lt;/strong&gt; &lt;/p&gt;&lt;p&gt;Looking back over the last year, there’s been this big sea change in how we build things, how good the tools are, how software works. What do you expect over the next year as capabilities keep increasing—both in how you make stuff and what you make?&lt;/p&gt;&lt;p&gt;&lt;em&gt;(00:28:18)&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Matt&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;The big thing this year will be about review. That’s where the bottleneck is now. We have agents capable of producing all of this stuff—they’re available enough, cheap enough—and now we’re being inundated with net new content. Not summaries of existing stuff; that’s been around for a while. This is: do you want me to go or not? And people are getting overwhelmed by it. We have to solve the problem of how we scale our value system—how we evaluate whether this new thing the agent created is actually good—and feel confident enough in that to let it run in auto mode.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan&lt;/strong&gt; &lt;/p&gt;&lt;p&gt;Do you have any sense of how that will work inside Figma, or what the interesting design considerations are for that kind of review flow?&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Matt&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;That’s one of the problems we’re really focused on—talking to customers, figuring it out. I think the industry is trying to understand what the new format is. Is it a recorded video walkthrough? Screenshots? Another agent with a different prompt that reviews the work, one that you trust so much you approve its decisions? It’s hard to predict, especially right now.&lt;/p&gt;&lt;p&gt;&lt;em&gt;(00:29:48)&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan&lt;/strong&gt; &lt;/p&gt;&lt;p&gt;One last question. 
There’s been a lot of back and forth over the last year or two about whether there’s a future for PMs, whether there’s a future for designers. If you want to be a PM, how do you break into the industry now? Maybe there are fewer PM seats, or engineers feel they don’t need PMs. How do you think about career progression for a PM—how someone who isn’t senior gets to where you are?&lt;/p&gt;&lt;p&gt;&lt;em&gt;(00:30:24)&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Matt&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;The fundamentals still matter. The best analogy I’ve seen is math class—you still had a calculator, but we all learned long division. We all learned to take derivatives by hand. Do I do that daily now? Absolutely not. But I think it’s incredibly important to understand those concepts and be able to do them by hand—to drive these systems well, you need to understand what’s underneath.&lt;/p&gt;&lt;p&gt;I’d be genuinely curious what CS 101 looks like now. There are two parallel worlds. One where you just dump your question into ChatGPT and get back, “Here are the 42 implementations of bubble sort—which one do you want?” And another where you’re a really curious person. You write the bubble sort in C, then you ask the model to compile it to assembly and explain it line by line—what’s a register, what’s L1 cache, what’s L2 cache. The people who can’t leverage these tools are the ones who just accept the output. The people who invent the next set of tools and push them to their maximum are the ones who are pushing the boundaries and understand how they’re put together. And to do that, you have to be curious. You can’t be the one who just said, “Give me the answer.” You have to be the person asking, “How does this actually work? Help me understand the next level.”&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan&lt;/strong&gt; &lt;/p&gt;&lt;p&gt;I agree. 
And it’s so much more fun to live that way.&lt;/p&gt;&lt;p&gt;&lt;em&gt;(00:32:15)&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Matt&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;It’s catnip for me. I don’t know if you’re a Hitchhiker’s Guide to the Galaxy person, but LLMs feel like the book—the literal manifestation of it. I have this on airplanes: I don’t run local LLMs often, but I’ll download an 8B model and run it offline, and it’s exactly that. You ask it “Why is the sky blue?” and it breaks down the refraction. You ask it “What is a squirrel?” and it answers that too. They’re not perfect—some are a little weird at the 8B size—but it’s a magical time to be alive for curious people.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan&lt;/strong&gt; &lt;/p&gt;&lt;p&gt;I totally agree. Matt, it was a pleasure.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Matt&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Thanks.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;em&gt;&lt;a href="https://every.to/@danshipper" rel="noopener noreferrer" target="_blank"&gt;Dan Shipper&lt;/a&gt;&lt;/em&gt;&lt;/strong&gt; &lt;em&gt;is the cofounder and CEO of Every, where he writes the&lt;/em&gt; &lt;em&gt;&lt;a href="https://every.to/chain-of-thought" rel="noopener noreferrer" target="_blank"&gt;Chain of Thought&lt;/a&gt;&lt;/em&gt; &lt;em&gt;column and hosts the podcast&lt;/em&gt; &lt;a href="https://open.spotify.com/show/5qX1nRTaFsfWdmdj5JWO1G" rel="noopener noreferrer" target="_blank"&gt;AI &amp;amp; I&lt;/a&gt;. &lt;em&gt;You can follow him on X at&lt;/em&gt; &lt;em&gt;&lt;a href="https://twitter.com/danshipper" rel="noopener noreferrer" target="_blank"&gt;@danshipper&lt;/a&gt;&lt;/em&gt; &lt;em&gt;and on&lt;/em&gt; &lt;em&gt;&lt;a href="https://www.linkedin.com/in/danshipper/" rel="noopener noreferrer" target="_blank"&gt;LinkedIn&lt;/a&gt;. 
&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;To read more essays like this, subscribe to &lt;u&gt;&lt;a href="https://every.to/subscribe" rel="noopener noreferrer" target="_blank"&gt;Every&lt;/a&gt;&lt;/u&gt;, and follow us on X at &lt;u&gt;&lt;a href="http://twitter.com/every" rel="noopener noreferrer" target="_blank"&gt;@every&lt;/a&gt;&lt;/u&gt; and on &lt;u&gt;&lt;a href="https://www.linkedin.com/company/everyinc/" rel="noopener noreferrer" target="_blank"&gt;LinkedIn&lt;/a&gt;&lt;/u&gt;.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;For sponsorship opportunities, reach out to sponsorships@every.to.&lt;/em&gt;&lt;/p&gt;&lt;div class="quill-button" data-source="{&amp;quot;id&amp;quot;:&amp;quot;quill-button-1769187301610&amp;quot;,&amp;quot;url&amp;quot;:&amp;quot;https://every.to/subscribe?source=post_button&amp;quot;,&amp;quot;text&amp;quot;:&amp;quot;Subscribe&amp;quot;}" id="quill-button-1769187301610"&gt;&lt;a href="https://every.to/subscribe?source=post_button"&gt;Subscribe&lt;/a&gt;&lt;/div&gt;</description>
      <author>Dan Shipper / AI &amp; I</author>
      <pubDate>2026-06-03 18:00:00 -0400</pubDate>
      <guid>https://every.to/podcast/figma-exec-on-why-the-saaspocalypse-is-a-goldmine</guid>
      <link>https://every.to/podcast/figma-exec-on-why-the-saaspocalypse-is-a-goldmine</link>
    </item>
    <item>
      <title>The Eight Levels of AI Adoption</title>
      <description>&lt;table&gt;&lt;tr&gt;&lt;td&gt;&lt;img alt="Guides" src="https://d24ovhgu8s7341.cloudfront.net/uploads/publication/logo/107/small_Guides_cover.png" /&gt;&lt;/td&gt;&lt;td&gt;&lt;/td&gt;&lt;td&gt;&lt;table&gt;&lt;tr&gt;&lt;td&gt;by &lt;a href="https://every.to/@mike_2114" itemprop="name"&gt;Mike Taylor&lt;/a&gt;, &lt;a href="https://every.to/@laura_27bbaf_1" itemprop="name"&gt;Laura Entis&lt;/a&gt;, and &lt;a href="https://every.to/@claude_17b3bd_1" itemprop="name"&gt;Claude &lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;in &lt;a href="https://every.to/guides"&gt;Guides&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;p&gt;All it takes is one viral post to make you feel like you’re using AI all wrong. Someone is running 12 Claude Code sessions in parallel. Someone else’s agent is answering emails while they sleep. Meanwhile, you’re still arguing with ChatGPT.&lt;/p&gt;&lt;p&gt;But here’s the thing: Keeping up with every power user isn’t the point. The best way to find value in AI is to use it in a way that fits your work—and to regularly check in to see if you could be getting more from it than you already are. (I was using &lt;strong&gt;&lt;u&gt;&lt;a href="https://steve-yegge.medium.com/welcome-to-gas-town-4f25ee16dd04" rel="noopener noreferrer" target="_blank"&gt;Steve Yegge&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;&lt;u&gt;&lt;a href="https://steve-yegge.medium.com/welcome-to-gas-town-4f25ee16dd04" rel="noopener noreferrer" target="_blank"&gt;’s “Gas Town”&lt;/a&gt;&lt;/u&gt; post about directing dozens of coding agents to illustrate this in client presentations, but it didn’t quite match with my experience, and I needed to modify it.)&lt;/p&gt;&lt;p&gt;This guide maps eight levels of AI adoption, from basic chatbot use to full agent orchestration. With each new level, you delegate more of your work to—and place more trust in—the AI. 
The following sections explain how each level works in practice, complete with sample prompts, so you can figure out which levels match your current needs and workflows, what’s possible at each stage, and when it’s time to move to the next one.&lt;/p&gt;&lt;table&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td data-row="row-fk5n" data-guide-table-header="true"&gt;Level&lt;/td&gt;&lt;td data-row="row-fk5n" data-guide-table-header="true"&gt;Description&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td data-row="row-0hr5"&gt;&lt;strong&gt;1—Chatbot&lt;/strong&gt;&lt;/td&gt;&lt;td data-row="row-0hr5"&gt;You give it a task, it provides a response. (ChatGPT, Claude, Gemini)&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td data-row="row-rntw"&gt;&lt;strong&gt;2—Copilot&lt;/strong&gt;&lt;/td&gt;&lt;td data-row="row-rntw"&gt;The AI exists inside your files and completes work alongside you. (Cursor, Claude in Excel, Gemini in Google Docs)&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td data-row="row-ntko"&gt;&lt;strong&gt;3—Agent&lt;/strong&gt;&lt;/td&gt;&lt;td data-row="row-ntko"&gt;You describe a task, and the agent executes it step by step, asking for your approval before moving on. (Cowork, Codex)&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td data-row="row-kduu"&gt;&lt;strong&gt;4—Autopilot&lt;/strong&gt;&lt;/td&gt;&lt;td data-row="row-kduu"&gt;You skip approvals and let an agent complete a task on its own, then review the results. (Lovable, Codex, Claude Code)&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td data-row="row-uey6"&gt;&lt;strong&gt;5—Workflows&lt;/strong&gt;&lt;/td&gt;&lt;td data-row="row-uey6"&gt;You build a system that professionalizes the agent’s output. (Compound engineering, Claude Workflows, Copilot AI Studio)&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td data-row="row-ixqo"&gt;&lt;strong&gt;6—Assistant&lt;/strong&gt;&lt;/td&gt;&lt;td data-row="row-ixqo"&gt;The agent works proactively in the background without being prompted. 
(OpenClaw, Hermes Agent, Claude Managed Agents)&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td data-row="row-3pmo"&gt;&lt;strong&gt;7—Multi-agent&lt;/strong&gt;&lt;/td&gt;&lt;td data-row="row-3pmo"&gt;You’re managing multiple long-running agents at the same time. (Claude Managed Agents, OpenClaw, or Codex Goals)&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td data-row="row-3tcx"&gt;&lt;strong&gt;8—Orchestrator&lt;/strong&gt;&lt;/td&gt;&lt;td data-row="row-3tcx"&gt;A manager agent runs a team of sub-agents on your behalf. (Gas Town, Paperclip, Symphony)&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;p&gt;A higher level isn’t necessarily better. The most sophisticated AI users I know operate at several levels at once, identifying the best level to work within based on the specific challenge in front of them. The right level for a task is generally determined by how much you trust the AI to do a good job without intervention—and how big a deal it’ll be if it does mess up. For high-stakes tasks, you should either stay at a lower level so you can supervise the AI, or be prepared to invest the time, engineering resources, and tokens necessary to get that same quality at a higher level with less human oversight. &lt;/p&gt;&lt;p&gt;Most people I talk to who are struggling to adopt AI have good reasons: The output quality is either too low for the work they do or it’s too expensive to achieve. Safely moving up to the next level requires effort and experimentation—or a jump in model capability.&lt;/p&gt;&lt;p&gt;The right level match for most of your tasks may also depend on your role. Broadly speaking, the sweet spot for knowledge workers right now falls somewhere between Levels 1 and 4. 
Engineers are more often in Levels 5 through 8, partly because they can build the scaffolding that makes newer, less stable systems usable before they’re ready for everyone else.&lt;/p&gt;&lt;p data-guide-block-kind="agent-buttons" data-guide-block-id="guide-block-1780413079165-o66j2o"&gt;&lt;br&gt;&lt;/p&gt;&lt;h2&gt;The levels&lt;/h2&gt;&lt;h3&gt;Level 1—Chatbot&lt;/h3&gt;&lt;div class="quill-block-image" id="quill-block-image-1780410254623-pnvds0z2t" data-source="{&amp;quot;dom_id&amp;quot;:&amp;quot;quill-block-image-1780410254623-pnvds0z2t&amp;quot;,&amp;quot;link&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4283/optimized_95d80989-6191-4ae0-b82f-9b84d15a92e6.png&amp;quot;,&amp;quot;image&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4283/optimized_95d80989-6191-4ae0-b82f-9b84d15a92e6.png&amp;quot;,&amp;quot;caption&amp;quot;:null,&amp;quot;error&amp;quot;:null}"&gt;&lt;div&gt;&lt;a href="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4283/optimized_95d80989-6191-4ae0-b82f-9b84d15a92e6.png" target="_blank" rel="noopener noreferrer"&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4283/optimized_95d80989-6191-4ae0-b82f-9b84d15a92e6.png" alt="Uploaded image"&gt;&lt;/a&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;&lt;strong&gt;What it is:&lt;/strong&gt; You ask, it answers. This is the classic chatbot experience: ChatGPT, Claude, Gemini, or any other model that’s not embedded in your files or your systems. 
You give it a task, and it returns a response.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;What changes at this level:&lt;/strong&gt; You move from doing everything yourself to drafting and synthesizing with an always-available AI generalist.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;What you can use it for:&lt;/strong&gt; Writing from rough notes, summarizing documents, or answering questions about uploaded files&lt;/p&gt;&lt;h4&gt;Try it:&lt;/h4&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1780410177188-nnu4en"&gt;I need to send a post-meeting follow-up email to a client. Here are my rough notes, the decisions we made, and two risks we need to flag. Draft the email in a calm, confident tone and end with three clear next steps. Tell me if anything sounds unclear or unsupported before you start writing.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Input:&lt;/strong&gt; Meeting notes&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Output:&lt;/strong&gt; A polished email draft that identifies if there’s any missing information that still needs to be filled in&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Human judgment:&lt;/strong&gt; Confirm that the tone and facts are right, and the email’s content is something you stand behind.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1780410200020-pda45d"&gt;I am uploading a 20-page PDF on our new benefits policy. 
Summarize the five changes employees will care about the most, and then answer these three questions: Who is affected, what specific policies does the new timeline impact, and what would likely confuse someone who is reading this quickly?&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Input:&lt;/strong&gt; A PDF or set of documents&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Output:&lt;/strong&gt; A summary and direct answers to your questions grounded in the source material&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Human judgment:&lt;/strong&gt; Verify that the summary is factual and that the model recognizes when the material is ambiguous.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;When to move up:&lt;/strong&gt; Chatbots can assist with a wide variety of tasks, but each session requires manual setup: You have to explain what you want, provide the necessary context, and transfer the chatbot’s response to wherever you’re getting work done. Consider moving to the next level if you get a lot of value from chatbot exchanges but are tired of copying and pasting. 
&lt;/p&gt;&lt;h3&gt;Level 2—Copilot&lt;/h3&gt;&lt;div class="quill-block-image" id="quill-block-image-1780410278457-a0usbb18l" data-source="{&amp;quot;dom_id&amp;quot;:&amp;quot;quill-block-image-1780410278457-a0usbb18l&amp;quot;,&amp;quot;link&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4283/optimized_943babb7-2e8e-41e7-b187-f404d05d89b1.png&amp;quot;,&amp;quot;image&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4283/optimized_943babb7-2e8e-41e7-b187-f404d05d89b1.png&amp;quot;,&amp;quot;caption&amp;quot;:null,&amp;quot;error&amp;quot;:null}"&gt;&lt;div&gt;&lt;a href="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4283/optimized_943babb7-2e8e-41e7-b187-f404d05d89b1.png" target="_blank" rel="noopener noreferrer"&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4283/optimized_943babb7-2e8e-41e7-b187-f404d05d89b1.png" alt="Uploaded image"&gt;&lt;/a&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;&lt;strong&gt;What it is:&lt;/strong&gt; The model is embedded inside the place where you’re already doing work and has access to everything in your document, spreadsheet, presentation, notes app, or code editor.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;What changes at this level:&lt;/strong&gt; AI stops being a separate tab and becomes an in-place collaborator that can extend, revise, and interpret the work you’re doing as you do it.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;What you can use it for:&lt;/strong&gt; Revising drafts, understanding a document set or workspace without manually pasting everything into a chat window, and making changes to a live spreadsheet without leaving the file&lt;/p&gt;&lt;h4&gt;Try it:&lt;/h4&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1780410707976-6wrbu8"&gt;Using the draft already in this doc, write the next two sections in the same voice. 
Keep the tone consistent with the existing text, preserve existing structure, and flag any areas where you need examples or evidence from me before you get started.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Input:&lt;/strong&gt; An unfinished document, memo, or social media post&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Output:&lt;/strong&gt; A continuation of your draft that matches the existing material&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Human judgment:&lt;/strong&gt; Decide whether the new sections sound like you wrote them, and then determine whether they successfully advanced your argument.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;p data-guide-block-label="Prompt (Claude in Excel)" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1780410731889-nnovch"&gt;Here is our cash flow projection for Q2. Update the monthly totals with these new numbers, flag any months where we are projected to go negative, and add a summary row at the bottom with the full-quarter picture.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Input:&lt;/strong&gt; A spreadsheet with your existing cash flow data. The new figures you want incorporated can be pasted directly into the prompt or provided as a second file.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Output:&lt;/strong&gt; Updated monthly cash flow figures, a list of months where the cash flow is projected to be negative, and a summary of projected cash flow for the entire quarter&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Human judgment:&lt;/strong&gt; Verify the formulas are correct, check that the summary is accurate, and determine what strategies you’d like to take for addressing months with a projected negative cash flow.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;When to move up:&lt;/strong&gt; Copilot removes the need to manually provide context, but it can only reliably access information from a single file. 
Consider moving to the next level if you need to pull, compile, or analyze information across multiple sources.&lt;/p&gt;&lt;h3&gt;Level 3—Agent&lt;/h3&gt;&lt;div class="quill-block-image" id="quill-block-image-1780410292780-9l4vq085f" data-source="{&amp;quot;dom_id&amp;quot;:&amp;quot;quill-block-image-1780410292780-9l4vq085f&amp;quot;,&amp;quot;link&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4283/optimized_82509d33-0272-448e-82e0-b515d33f0233.png&amp;quot;,&amp;quot;image&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4283/optimized_82509d33-0272-448e-82e0-b515d33f0233.png&amp;quot;,&amp;quot;caption&amp;quot;:null,&amp;quot;error&amp;quot;:null}"&gt;&lt;div&gt;&lt;a href="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4283/optimized_82509d33-0272-448e-82e0-b515d33f0233.png" target="_blank" rel="noopener noreferrer"&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4283/optimized_82509d33-0272-448e-82e0-b515d33f0233.png" alt="Uploaded image"&gt;&lt;/a&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;&lt;strong&gt;What it is:&lt;/strong&gt; You describe a task, and the agent works step by step to complete it, checking in with you for approval along the way. It can access your files and systems, perform actions on your computer, and compile information from multiple sources.&lt;/p&gt;&lt;p&gt;One key distinction worth keeping in mind: An agent in this context is reactive. 
It waits for you to initiate and will not start a task unless you explicitly tell it to.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;What changes at this level:&lt;/strong&gt; AI becomes a true operator capable of executing multi-step tasks with supervision.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;What you can use it for:&lt;/strong&gt; Using figures from one file to update another, or building something new—like a dashboard—from a set of source documents&lt;/p&gt;&lt;h4&gt;Try it:&lt;/h4&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1780410758577-gmb1qx"&gt;Take the Q4 revenue numbers from this spreadsheet and update the board deck with the new figures, charts, and commentary. Show me the proposed edits slide by slide before you apply them, and call out anywhere the source data seems inconsistent.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Input:&lt;/strong&gt; A spreadsheet and a presentation deck&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Output:&lt;/strong&gt; Proposed slide updates tied to specific data&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Human judgment:&lt;/strong&gt; Confirm that the interpretation of the data is how you’d like to present it, correct any factual or contextual issues the agent might have missed, and approve the changes.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1780410781426-mcg0ur"&gt;Using the NPS data in this file, build a simple dashboard I can open in a browser. I want to track overall score, key themes in the comments, and how responses break down by segment. 
Before you build it, tell me how you plan to structure it and what assumptions you are making about the data.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Input:&lt;/strong&gt; A data file and a dedicated folder the agent can work within&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Output:&lt;/strong&gt; A working dashboard, along with a detailed plan for how it was built and a summary of the assumptions the agent made about the data&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Human judgment:&lt;/strong&gt; Approve the plan, confirm the dashboard works the way you want it to, and determine whether any assumptions the agent made about the data need to be revised.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;When to move up:&lt;/strong&gt; With an agent, the process is iterative: the agent completes a step, you review and refine, and the cycle repeats. Consider moving to the next level when you want to relinquish control in exchange for speed or the ability to one-shot a prototype without writing any code.&lt;/p&gt;&lt;h3&gt;Level 4—Autopilot&lt;/h3&gt;&lt;div class="quill-block-image" id="quill-block-image-1780410308560-fzsk3vgpw" data-source="{&amp;quot;dom_id&amp;quot;:&amp;quot;quill-block-image-1780410308560-fzsk3vgpw&amp;quot;,&amp;quot;link&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4283/optimized_3088f1c9-89df-4e7b-97cd-026e933a6d03.png&amp;quot;,&amp;quot;image&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4283/optimized_3088f1c9-89df-4e7b-97cd-026e933a6d03.png&amp;quot;,&amp;quot;caption&amp;quot;:null,&amp;quot;error&amp;quot;:null}"&gt;&lt;div&gt;&lt;a href="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4283/optimized_3088f1c9-89df-4e7b-97cd-026e933a6d03.png" target="_blank" rel="noopener noreferrer"&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4283/optimized_3088f1c9-89df-4e7b-97cd-026e933a6d03.png" alt="Uploaded 
image"&gt;&lt;/a&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;&lt;strong&gt;What it is:&lt;/strong&gt; You skip permissions and let an agent complete a task on its own, then review the results. With an agent, you stay involved in the process because you care how each step gets done. On autopilot, which is often called &lt;u&gt;&lt;a href="https://every.to/working-overtime/it-s-me-hi-i-m-the-vibe-coder" rel="noopener noreferrer" target="_blank"&gt;vibe coding&lt;/a&gt;&lt;/u&gt;, you describe what you want, let the system run, and evaluate what comes back. At this stage, you’re typically building something other users will interact with, such as a prototype or landing page.&lt;/p&gt;&lt;p&gt;Determining which tasks can be done on autopilot depends on how capable the model is, a calculation that changes with every release. For example, I’ll happily produce a landing page on autopilot, because the models are good enough to make one that meets my standards. I can’t do the same with a complex slide deck, at least not yet—the result is so far from what I want that correcting it takes longer than doing it myself. As the models improve, you can get away with doing more of your work on autopilot. &lt;/p&gt;&lt;p&gt;&lt;strong&gt;What changes at this level:&lt;/strong&gt; You hand over the entire task to the model and review the end result instead of revising along the way.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;What you can use it for:&lt;/strong&gt; Building prototypes, internal tools, and first-pass products. Autopilot is the first level that allows you to build something other people can use without having to write a line of code yourself. It can also usually cover routine tasks, such as filling out recurring forms or drafting weekly status reports. 
&lt;/p&gt;&lt;h4&gt;Try it:&lt;/h4&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1780410795476-ctr1mb"&gt;Build me a lightweight internal lead-scoring tool for our sales team. It should let us paste in account notes, assign a score from 1 to 5, and show which factors drove the score. Use dummy data for now and make the interface clean enough that I can demo it tomorrow.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Input:&lt;/strong&gt; A plain-English description of what the tool should do, who will use it, and any constraints, such as whether it needs to work in a browser or stay local&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Output:&lt;/strong&gt; A functioning prototype&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Human judgment:&lt;/strong&gt; Test the output and decide if it’s demo-ready. A prototype doesn’t need to be perfect, but it’s worth noting where you’d need to invest in reliability before putting it in front of users.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1780410818693-8ylyc8"&gt;Build a landing page for our new feature. It should explain what the feature does, include a clear call to action, and match the tone and brand colors of our existing site. Make it responsive.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Input:&lt;/strong&gt; A product brief, brand guidelines, and the existing site as a reference. 
Brand guidelines can be as simple as a color palette and a few sentences about tone; if you don’t have a formal document, describing your existing site in a sentence or two is enough to get started.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Output:&lt;/strong&gt; A working landing page&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Human judgment:&lt;/strong&gt; Read the copy, test the page on mobile, and decide whether it’s ready to share more widely.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;When to move up:&lt;/strong&gt; Autopilot is fast, but it often produces uneven or unreliable results. That might be fine for a prototype, but for higher-stakes work, you’ll want to build a repeatable system around the agent that structures its thinking and execution. Consider moving to the next level if you want the speed and versatility of autopilot with more structured quality control. &lt;/p&gt;&lt;h3&gt;Level 5—Workflows&lt;/h3&gt;&lt;div class="quill-block-image" id="quill-block-image-1780410319302-xt6jfkzr7" data-source="{&amp;quot;dom_id&amp;quot;:&amp;quot;quill-block-image-1780410319302-xt6jfkzr7&amp;quot;,&amp;quot;link&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4283/optimized_50985b3b-17a4-4072-a6df-9e0572270492.png&amp;quot;,&amp;quot;image&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4283/optimized_50985b3b-17a4-4072-a6df-9e0572270492.png&amp;quot;,&amp;quot;caption&amp;quot;:null,&amp;quot;error&amp;quot;:null}"&gt;&lt;div&gt;&lt;a href="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4283/optimized_50985b3b-17a4-4072-a6df-9e0572270492.png" target="_blank" rel="noopener noreferrer"&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4283/optimized_50985b3b-17a4-4072-a6df-9e0572270492.png" alt="Uploaded image"&gt;&lt;/a&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;&lt;strong&gt;What it is:&lt;/strong&gt; You build a system, or harness, 
around your agent that professionalizes its output. Instead of a one-shot run, your agent plans, reviews, performs confidence checks, and runs code through other safeguards to make the results more reliable. This is a transition from vibe coding to agentic engineering. The pace is still fast, but because you have structured the process and included guardrails that catch and fix mistakes, the output is of a higher quality.&lt;/p&gt;&lt;p&gt;This level is primarily the domain of engineers. Reviewing a plan, evaluating which tests need to be done, and designing the harness that keeps the agent from going off the rails all require an understanding of what’s happening under the hood. The &lt;u&gt;&lt;a href="https://every.to/source-code/compound-engineering-the-definitive-guide" rel="noopener noreferrer" target="_blank"&gt;compound engineering guide&lt;/a&gt;&lt;/u&gt; covers this in detail. Much of this discipline will be baked into platforms over the next six to 12 months; for now, it requires technical judgment to implement.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;What changes at this level:&lt;/strong&gt; You stop treating the agent as a one-shot performer. By designing a repeatable process for the agent to follow—and encoding your standards into that process—you can trust the agent with work you’d otherwise want to do by hand.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;What you can use it for:&lt;/strong&gt; Shipping features with a plan-review-implement loop, turning a vibe-coded prototype into something stable enough for production, or building a process other engineers on the team can follow&lt;/p&gt;&lt;h4&gt;Try it:&lt;/h4&gt;&lt;p&gt;Run &lt;code&gt;/plan&lt;/code&gt; (plan mode in Claude Code, or &lt;code&gt;/ce-plan&lt;/code&gt; in compound engineering) before writing any code.&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1780410870192-7lar5c"&gt;&lt;code&gt;/ce-plan&lt;/code&gt; Inspect this repo and propose a plan for adding a customer support inbox view. 
Include the files you expect to touch, edge cases, and how you will verify the behavior. Wait for my approval before implementing.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Input:&lt;/strong&gt; A codebase the agent has access to, and a written feature request or specification. The more context you can give upfront—existing architecture patterns, relevant files, known constraints—the better the plan will be.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Output:&lt;/strong&gt; A plan for building a feature that you can review before the agent implements anything&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Human judgment:&lt;/strong&gt; Evaluate the plan and make any necessary improvements before having your agent implement it.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;After the agent finishes a change, run &lt;code&gt;/ce-code-review&lt;/code&gt; (or ask it to review its own work).&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1780410920444-zfnyf7"&gt;Review this change like a skeptical teammate would. Tell me how confident you are from 1 to 100, list the weakest parts of the implementation, and make another pass until you are above 90 or can clearly explain why you are not.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Input:&lt;/strong&gt; The completed change—a diff, a set of modified files, or a pull request—plus the original spec or plan the agent was working from, so the review can check whether the implementation matches the instructions &lt;/p&gt;&lt;p&gt;&lt;strong&gt;Output:&lt;/strong&gt; A self-review, confidence score, and an improved version of the feature&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Human judgment:&lt;/strong&gt; Decide whether the confidence score is justified and whether you agree with the review. 
If the agent rates itself highly but you identify issues it didn’t flag, name them and have it do another pass.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;When to move up:&lt;/strong&gt; Even the most sophisticated workflows require you to activate them, which, for certain tasks, becomes a bottleneck. Consider moving to the next level if there are areas of your life or work you’d trust an agent to handle without checking in with you first. (At this stage of model development, that’s more often lower-stakes administrative or household tasks.) &lt;/p&gt;&lt;h3&gt;Level 6—Assistant&lt;/h3&gt;&lt;div class="quill-block-image" id="quill-block-image-1780410326986-7rmyy8x1u" data-source="{&amp;quot;dom_id&amp;quot;:&amp;quot;quill-block-image-1780410326986-7rmyy8x1u&amp;quot;,&amp;quot;link&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4283/optimized_aedc1a8d-12f9-4e2c-8832-c97a2a02d24a.png&amp;quot;,&amp;quot;image&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4283/optimized_aedc1a8d-12f9-4e2c-8832-c97a2a02d24a.png&amp;quot;,&amp;quot;caption&amp;quot;:null,&amp;quot;error&amp;quot;:null}"&gt;&lt;div&gt;&lt;a href="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4283/optimized_aedc1a8d-12f9-4e2c-8832-c97a2a02d24a.png" target="_blank" rel="noopener noreferrer"&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4283/optimized_aedc1a8d-12f9-4e2c-8832-c97a2a02d24a.png" alt="Uploaded image"&gt;&lt;/a&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;&lt;strong&gt;What it is:&lt;/strong&gt; Unlike an agent—which waits for you to tell it to do something—an assistant acts on your behalf without being prompted. It can monitor a domain, do recurring work, and surface relevant information around the clock. 
For example, OpenClaw’s heartbeat.md file triggers every half hour with instructions around priorities, and the agent takes action automatically. No need to prompt.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;What changes at this level:&lt;/strong&gt; AI moves from providing reactive help to proactive, ongoing support.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;What you can use it for:&lt;/strong&gt; Recurring research, monitoring a topic you care about, or personal administrative work that would otherwise fall through the cracks&lt;/p&gt;&lt;p&gt;This level still requires either technical knowledge or access to someone who can walk you through the onboarding process and fix your assistant when it breaks. On the consulting team, we have an AI assistant that handles all project management and sales pipeline-related tasks, but it only &lt;u&gt;&lt;a href="https://every.to/p/what-i-learned-onboarding-our-ai-project-manager" rel="noopener noreferrer" target="_blank"&gt;reliably functions&lt;/a&gt;&lt;/u&gt; because it’s maintained by Every senior engineer &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@nityesh" rel="noopener noreferrer" target="_blank"&gt;Nityesh Agarwal&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;. &lt;/p&gt;&lt;p&gt;OpenClaw is the most popular platform for personal AI assistants, but it’s inherently unstable and time-intensive to set up. Its memory problem hasn’t been solved yet, so it can struggle to retain context between sessions.&lt;/p&gt;&lt;p&gt;Lower-stakes personal uses, such as monitoring your inbox for emails from your child’s school or tracking household purchases, are more accessible with the current state of available models than giving an assistant access to your work systems, which requires engineering and IT support to do safely. 
Risk tolerance matters here more than at any earlier level.&lt;/p&gt;&lt;h4&gt;Try it:&lt;/h4&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1780410954294-1x1f31"&gt;Every 30 minutes, check my calendar and flag events taking place within the next two hours that require preparation. If there is a meeting with no agenda, draft a short suggested one based on its title and attendees.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Input:&lt;/strong&gt; Calendar access and your preferences about what events qualify as requiring prep—for example, whether you want to be flagged for one-on-ones, external calls, or anything over 30 minutes. Output is typically delivered to a messaging app like Slack, although the specific setup depends on which platform you’re using. &lt;/p&gt;&lt;p&gt;&lt;strong&gt;Output:&lt;/strong&gt; A recurring brief delivered to your message app of choice&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Human judgment:&lt;/strong&gt; Decide what is urgent, and refine the rules based on the results.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1780410971627-ktovru"&gt;Monitor my inbox for emails from my child’s school. Each morning, give me a short summary of anything I need to know or act on. Also keep a running log of recent grocery purchases and let me know when we are running low on staples.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Input:&lt;/strong&gt; Access to your calendar, inbox, and receipts  &lt;/p&gt;&lt;p&gt;&lt;strong&gt;Output:&lt;/strong&gt; A daily brief and a running household inventory&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Human judgment:&lt;/strong&gt; Verify that the summary captures the most important information and the agent is accurately identifying grocery items that need to be restocked. 
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;When to move up:&lt;/strong&gt; When set up correctly, an always-on assistant can proactively handle a wide variety of tasks. Consider moving to the next level if you want your assistant to accomplish even more for you, but don’t want to interrupt its existing workflow or are worried about overburdening its memory. &lt;/p&gt;&lt;h3&gt;Level 7—Multi-agent&lt;/h3&gt;&lt;div class="quill-block-image" id="quill-block-image-1780410334393-jcebi9qez" data-source="{&amp;quot;dom_id&amp;quot;:&amp;quot;quill-block-image-1780410334393-jcebi9qez&amp;quot;,&amp;quot;link&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4283/optimized_8acad988-d418-40d4-96b7-63e1918f0b2f.png&amp;quot;,&amp;quot;image&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4283/optimized_8acad988-d418-40d4-96b7-63e1918f0b2f.png&amp;quot;,&amp;quot;caption&amp;quot;:null,&amp;quot;error&amp;quot;:null}"&gt;&lt;div&gt;&lt;a href="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4283/optimized_8acad988-d418-40d4-96b7-63e1918f0b2f.png" target="_blank" rel="noopener noreferrer"&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4283/optimized_8acad988-d418-40d4-96b7-63e1918f0b2f.png" alt="Uploaded image"&gt;&lt;/a&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;&lt;strong&gt;What it is:&lt;/strong&gt; You are managing multiple long-running agents or assistants at the same time. Each one has a role, a task, or an area of responsibility, and your work starts to look more like leading a small team. 
This level is firmly in senior engineering territory—it is rare for knowledge workers to be running multiple parallel agent sessions.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;What changes at this level:&lt;/strong&gt; Your productivity multiplies when you move from one agent doing a task to having several agents working on tasks in parallel.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;What you can use it for:&lt;/strong&gt; Running implementation and planning simultaneously, or automating recurring investigation work so it no longer requires your direct attention&lt;/p&gt;&lt;h4&gt;Try it:&lt;/h4&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1780410999804-wbkom3"&gt;You already have one always-on agent—perhaps a custom Claude agent that runs on its own Mac Mini—that handles your editorial work. Rather than interrupt its workflow to have it complete an unrelated task, you set up a second agent that is responsible for a different job function: &lt;em&gt;“You’re responsible for our customer support inbox. 
Triage new tickets as they come in, draft replies for the routine ones, and flag anything that needs a human.”&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Input:&lt;/strong&gt; A custom long-running agent with its own scope, tools, and memory, kept separate from the first so their contexts don’t bleed together&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Output:&lt;/strong&gt; Two agents working in parallel with distinct job functions, skills, and memory&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1780411485754-nd8l9n"&gt;Systematically review each agent’s work to determine whether it’s executing at the level you need it to and its job description is focused enough that its memory isn’t getting overburdened.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Input:&lt;/strong&gt; A bug-reporting system connected to an agent trigger&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Output:&lt;/strong&gt; A steady stream of pull requests, each tied to a specific reported issue&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Human judgment:&lt;/strong&gt; Review each pull request, merge the approved ones, and identify cases where the agent misdiagnosed the problem.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;When to move up:&lt;/strong&gt; Long-running agents are valuable because they can largely work independently, but you still need to set their goals and evaluate their progress. Consider moving to the next level when you have so many of these agents that you lose track of which one is responsible for what. 
&lt;/p&gt;&lt;h3&gt;Level 8—Orchestrator&lt;/h3&gt;&lt;div class="quill-block-image" id="quill-block-image-1780410341010-7jpus9owk" data-source="{&amp;quot;dom_id&amp;quot;:&amp;quot;quill-block-image-1780410341010-7jpus9owk&amp;quot;,&amp;quot;link&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4283/optimized_73dabd59-1756-4842-9a90-1df893c9e224.png&amp;quot;,&amp;quot;image&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4283/optimized_73dabd59-1756-4842-9a90-1df893c9e224.png&amp;quot;,&amp;quot;caption&amp;quot;:null,&amp;quot;error&amp;quot;:null}"&gt;&lt;div&gt;&lt;a href="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4283/optimized_73dabd59-1756-4842-9a90-1df893c9e224.png" target="_blank" rel="noopener noreferrer"&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4283/optimized_73dabd59-1756-4842-9a90-1df893c9e224.png" alt="Uploaded image"&gt;&lt;/a&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;&lt;strong&gt;What it is:&lt;/strong&gt; An orchestrator agent manages a team of agents. It plans, delegates, monitors progress, and consolidates outputs so you can focus on bigger-picture tasks, such as setting overall goals or reviewing major decisions. Tools like &lt;u&gt;&lt;a href="https://steve-yegge.medium.com/welcome-to-gas-town-4f25ee16dd04" rel="noopener noreferrer" target="_blank"&gt;Gas Town&lt;/a&gt;&lt;/u&gt;, &lt;u&gt;&lt;a href="https://github.com/paperclipai/paperclip" rel="noopener noreferrer" target="_blank"&gt;Paperclip&lt;/a&gt;&lt;/u&gt;, and &lt;u&gt;&lt;a href="https://github.com/openai/symphony" rel="noopener noreferrer" target="_blank"&gt;Symphony&lt;/a&gt;&lt;/u&gt; (from OpenAI) are early examples of this model.&lt;/p&gt;&lt;p&gt;It’s critical to note that this level is highly experimental. 
Even engineers operating at the frontier still largely fill the role of orchestrator themselves rather than trusting an orchestrator agent to handle complex coordination work.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;What changes at this level:&lt;/strong&gt; You stop managing each individual agent and instead focus on setting goals, establishing constraints, and implementing approval thresholds.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;What you can use it for:&lt;/strong&gt; Projects where the economics only make sense if you remove yourself as the bottleneck—building a system for keeping track of who’s doing what, sequencing work across multiple agents, and making sure the right issues are escalated without you &lt;/p&gt;&lt;h4&gt;Try it:&lt;/h4&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1780411037211-5lteo6"&gt;An always-on agent takes the next ticket in the queue from your project management software. &lt;em&gt;“Your job is to design a landing page for this SEO keyword [insert keyword]. Break the research up into parallel search queries related to the topic, search our company documents for unique insights, then write up a full page using the /brand-style skill.”&lt;/em&gt; Agents continue to take and complete tickets until the board is clear, and the fully completed project is ready for human review.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Input:&lt;/strong&gt; A high-level objective, defined agent roles, and rules for what requires human review&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Output:&lt;/strong&gt; A managed project where you receive critical updates instead of raw output from every agent that’s running in parallel&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Human judgment:&lt;/strong&gt; Determine whether the orchestrator is doing a good job triaging issues or if too many—or too few—are being handed over for you to review. 
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1780411056860-pbqxcs"&gt;Set up a pipeline that reviews each code submission against our codebase standards, runs the tests, checks for common issues, and escalates issues to me only when they require a judgment call.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Input:&lt;/strong&gt; A repository, contribution guidelines, and a test suite&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Output:&lt;/strong&gt; A short queue of escalated items that need your input, instead of hundreds of raw submissions that require manual triage&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Human judgment:&lt;/strong&gt; Establish the threshold for what qualifies as something that needs your attention, and raise or lower the bar as needed. The agent can flag those tasks for human review, or it can work on the entire project autonomously until all tests pass and the agent has recorded a video of the software working end to end. &lt;/p&gt;&lt;h2&gt;What the levels measure&lt;/h2&gt;&lt;p&gt;There is no value judgment baked into these levels. The vast majority of people should not pursue orchestration, for example, because the models aren’t reliable enough for most use cases. That said, as the technology improves, it can be worth revisiting a level that was previously inaccessible to you or your company. Model releases can pull everyone up, making tools and systems more reliable and easier to use.&lt;/p&gt;&lt;p&gt;If you take anything away from this guide, let it be this: AI use is not a competition. You wouldn’t brag that you had eight interns working overnight on a key project, and you hadn’t checked their output. Instead, you’d work closely with them for months until you were confident they had enough training to be able to work autonomously. 
Expect to put in a similar amount of effort with your agents before you can trust them to get reliable results at the next level of autonomy. Determining which levels fit your specific needs—rather than seeing how far you can ascend for the sake of it—is the most important thing you can do if you want to make better use of the technology. &lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://every.to/@mike_2114" rel="noopener noreferrer" target="_blank"&gt;Mike Taylor&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt; is the head of &lt;u&gt;&lt;a href="https://every.to/consulting" rel="noopener noreferrer" target="_blank"&gt;tech consulting&lt;/a&gt;&lt;/u&gt; at Every and a co-author of &lt;/em&gt;&lt;u&gt;&lt;a href="https://www.oreilly.com/library/view/prompt-engineering-for/9781098153427/" rel="noopener noreferrer" target="_blank"&gt;Prompt Engineering for Generative AI&lt;/a&gt;&lt;/u&gt; (O’Reilly)&lt;em&gt;. Learn more about how Every’s consulting team can &lt;u&gt;&lt;a href="https://every.to/consulting?utm_source=emailfooter" rel="noopener noreferrer" target="_blank"&gt;bring AI into your organization&lt;/a&gt;&lt;/u&gt;.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://every.to/@laura_27bbaf_1" rel="noopener noreferrer" target="_blank"&gt;Laura Entis&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt; is a staff writer at Every. You can follow her on &lt;u&gt;&lt;a href="https://www.linkedin.com/in/lauraentis/" rel="noopener noreferrer" target="_blank"&gt;LinkedIn&lt;/a&gt;&lt;/u&gt;.&lt;/em&gt;&lt;/p&gt;</description>
      <author>Mike Taylor, Laura Entis, and Claude  / Guides</author>
      <pubDate>2026-06-02 18:00:00 -0400</pubDate>
      <guid>https://every.to/guides/the-eight-levels-of-ai-adoption</guid>
      <link>https://every.to/guides/the-eight-levels-of-ai-adoption</link>
    </item>
    <item>
      <title>Where Do You Fall on the Eight Levels of AI Adoption?</title>
      <description>&lt;table&gt;&lt;tr&gt;&lt;td&gt;&lt;/td&gt;&lt;td&gt;&lt;table&gt;&lt;tr&gt;&lt;td&gt;by &lt;a href="https://every.to/@mike_2114" itemprop="name"&gt;Mike Taylor&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;figure&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/post/cover/4284/full_page_cover_73ff78874a104984-cover_image.png"&gt;&lt;figcaption&gt;Midjourney/Every illustration.&lt;/figcaption&gt;&lt;/figure&gt;&lt;p&gt;&lt;em&gt;Was this newsletter forwarded to you? &lt;u&gt;&lt;a href="https://every.to/account" rel="noopener noreferrer" target="_blank"&gt;Sign up&lt;/a&gt;&lt;/u&gt; to get it in your inbox.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;All it takes is one viral post to make you feel like you’re using AI all wrong. Someone’s running 12 Claude Code sessions in parallel. Someone else’s agent answers emails while they sleep. Meanwhile, you’re still arguing with ChatGPT.&lt;/p&gt;&lt;p&gt;Here’s the thing: Keeping up with the power users isn’t the point. The best way to get value from AI is to use it in a way that fits your work—and to check in now and then to see whether you could be getting more from it. &lt;/p&gt;&lt;p&gt;With that in mind, today we published a &lt;a href="https://every.to/guides/the-eight-levels-of-ai-adoption?source=post_button" rel="noopener noreferrer" target="_blank"&gt;guide&lt;/a&gt; that maps all eight levels of AI adoption, from chatbot basics to full agent orchestration. We explain how each level works in practice, with sample prompts, so you can figure out which ones match your current needs and workflows, what’s possible at each stage, and when it’s time to move to the next one. 
&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;strong&gt;Level 1—Chatbot: &lt;/strong&gt;You ask, it answers.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Level 2—Copilot:&lt;/strong&gt; The AI works alongside you, inside your files.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Level 3—Agent:&lt;/strong&gt; It executes a task step by step, checking in for approval.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Level 4—Autopilot:&lt;/strong&gt; It runs on its own; you review the result.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Level 5—Workflows:&lt;/strong&gt; You build a system that makes its output more reliable.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Level 6—Assistant: &lt;/strong&gt;It works in the background, without being prompted.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Level 7—Multi-agent:&lt;/strong&gt; You manage several long-running agents at once.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Level 8—Orchestrator:&lt;/strong&gt; A manager agent runs a team of sub-agents for you.&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;A higher level isn’t necessarily better. The right level for a task is generally determined by how much you trust the AI to do a good job without intervention, and how big a deal it’ll be if it does mess up.&lt;/p&gt;&lt;p&gt;If you want to know where you fall on the AI adoption spectrum—and whether it’s time to experiment with higher levels—&lt;a href="https://every.to/guides/the-eight-levels-of-ai-adoption?source=post_button" rel="noopener noreferrer" target="_blank"&gt;this guide&lt;/a&gt; is for you. 
&lt;/p&gt;&lt;div class="quill-button" data-source="{&amp;quot;id&amp;quot;:&amp;quot;quill-button-1780412770281&amp;quot;,&amp;quot;text&amp;quot;:&amp;quot;Read the 8 levels guide&amp;quot;,&amp;quot;url&amp;quot;:&amp;quot;https://every.to/guides/the-eight-levels-of-ai-adoption?source=post_button&amp;quot;}" id="quill-button-1780412770281"&gt;&lt;a href="https://every.to/guides/the-eight-levels-of-ai-adoption?source=post_button"&gt;Read the 8 levels guide&lt;/a&gt;&lt;/div&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://every.to/@mike_2114" rel="noopener noreferrer" target="_blank"&gt;Mike Taylor&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt; &lt;em&gt;is the head of tech consulting at Every and a co-author of &lt;/em&gt;&lt;u&gt;&lt;a href="https://www.oreilly.com/library/view/prompt-engineering-for/9781098153427/" rel="noopener noreferrer" target="_blank"&gt;Prompt Engineering for Generative AI&lt;/a&gt;&lt;/u&gt; (O’Reilly)&lt;em&gt;.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;To read more essays like this, subscribe to &lt;u&gt;&lt;a href="https://every.to/subscribe" rel="noopener noreferrer" target="_blank"&gt;Every&lt;/a&gt;&lt;/u&gt;, and follow us on X at &lt;u&gt;&lt;a href="http://twitter.com/every" rel="noopener noreferrer" target="_blank"&gt;@every&lt;/a&gt;&lt;/u&gt; and on &lt;u&gt;&lt;a href="https://www.linkedin.com/company/everyinc/" rel="noopener noreferrer" target="_blank"&gt;LinkedIn&lt;/a&gt;&lt;/u&gt;.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;We &lt;u&gt;&lt;a href="https://every.to/studio" rel="noopener noreferrer" target="_blank"&gt;build AI tools&lt;/a&gt;&lt;/u&gt; for readers like you. Write brilliantly with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://writewithspiral.com/" rel="noopener noreferrer" target="_blank"&gt;Spiral&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. 
Organize files automatically with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://makeitsparkle.co/?utm_source=everyfooter" rel="noopener noreferrer" target="_blank"&gt;Sparkle&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. Deliver yourself from email with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://cora.computer/" rel="noopener noreferrer" target="_blank"&gt;Cora&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. Dictate effortlessly with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://monologue.to/" rel="noopener noreferrer" target="_blank"&gt;Monologue&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. Collaborate with agents on documents with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;a href="https://www.proofeditor.ai/" rel="noopener noreferrer" target="_blank"&gt;Proof&lt;/a&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. &lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;We also do AI training, adoption, and innovation for companies. &lt;u&gt;&lt;a href="https://every.to/consulting?utm_source=emailfooter" rel="noopener noreferrer" target="_blank"&gt;Work with us&lt;/a&gt;&lt;/u&gt; to bring AI into your organization.&lt;/em&gt;&lt;/p&gt;&lt;div class="quill-button" data-source="{&amp;quot;id&amp;quot;:&amp;quot;quill-button-1769187301610&amp;quot;,&amp;quot;url&amp;quot;:&amp;quot;https://every.to/subscribe?source=post_button&amp;quot;,&amp;quot;text&amp;quot;:&amp;quot;Subscribe&amp;quot;}" id="quill-button-1769187301610"&gt;&lt;a href="https://every.to/subscribe?source=post_button"&gt;Subscribe&lt;/a&gt;&lt;/div&gt;</description>
      <author>Mike Taylor</author>
      <pubDate>2026-06-02 07:00:00 -0400</pubDate>
      <guid>https://every.to/p/where-do-you-fall-on-the-eight-levels-of-ai-adoption</guid>
      <link>https://every.to/p/where-do-you-fall-on-the-eight-levels-of-ai-adoption</link>
    </item>
    <item>
      <title>Company-wide AI Implementation in Five Steps</title>
      <description>&lt;table&gt;&lt;tr&gt;&lt;td&gt;&lt;/td&gt;&lt;td&gt;&lt;table&gt;&lt;tr&gt;&lt;td&gt;by &lt;a href="https://every.to/@natalia_2944" itemprop="name"&gt;Natalia Quintero&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;figure&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/post/cover/4282/full_page_cover_e22a781eeacaf44b-monday_s_piece.png"&gt;&lt;figcaption&gt;Midjourney/Every illustration.&lt;/figcaption&gt;&lt;/figure&gt;&lt;p&gt;&lt;em&gt;Join me and &lt;/em&gt;&lt;strong&gt;&lt;em&gt;Dan Shipper&lt;/em&gt;&lt;/strong&gt;&lt;em&gt; for a live session on what AI fluency looks like at the executive level tomorrow, Tuesday, &lt;/em&gt;&lt;strong&gt;&lt;em&gt;June 2&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. We’ll walk through how the leaders we work with—at hedge funds, private equity firms, and Fortune 500 companies—are using AI in their day-to-day, and what they wish they’d done differently six months in. &lt;a href="https://every.to/events/executive-AI-sessions" rel="noopener noreferrer" target="_blank"&gt;RSVP&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;Was this newsletter forwarded to you? &lt;u&gt;&lt;a href="https://every.to/account" rel="noopener noreferrer" target="_blank"&gt;Sign up&lt;/a&gt;&lt;/u&gt; to get it in your inbox.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;Sitting across from the chief operating officer of a health tech company earlier this year, I watched her name a problem many executives are feeling but few say out loud.&lt;/p&gt;&lt;p&gt;“Our junior employees are probably much more native with this technology,” she said. “And we need to make sure we’re sticking with it. 
Makes me feel like a dinosaur to say that, but it’s true.” &lt;/p&gt;&lt;p&gt;Confessions like this come up regularly during our executive training sessions: Leaders aren’t working directly with AI on sophisticated tasks, even as they’re guiding planning decisions about the technology. They know they &lt;em&gt;should&lt;/em&gt; spend more time learning the tools, but they haven’t committed to it yet. That’s understandable; executives are incredibly busy. But what we see in our sessions is that leaders who haven’t gotten their hands dirty don’t clearly understand the practical opportunities and challenges of AI. That health tech executive’s admission sparked an important conversation about how a coordinated company-wide approach to AI implementation starts with executive AI fluency—but doesn’t stop there. &lt;/p&gt;&lt;p&gt;We see this pattern in every engagement we run in our consulting work. Over the past two years, we’ve trained thousands of people at companies including the&lt;em&gt; New York Times&lt;/em&gt;, Ripple, Headway, and Thumbtack, and at investment firms managing over $100 billion in assets. We’ve done the workshops and watched what changed six months later.&lt;/p&gt;&lt;p&gt;AI usage in the workplace is now widespread, but it’s an altogether different ballgame to build organizational capability that truly realizes financial gains. &lt;/p&gt;&lt;p&gt;McKinsey &lt;u&gt;&lt;a href="https://www.mckinsey.com/capabilities/quantumblack/our-insights/the-state-of-ai" rel="noopener noreferrer" target="_blank"&gt;defines&lt;/a&gt;&lt;/u&gt; AI high performers as organizations that report both significant value from AI and more than a 5 percent impact on earnings before interest and taxes (EBIT). 
These companies are nearly three times as likely as others to have fundamentally redesigned their workflows, but they remain a minority: Only 6 percent of the nearly 2,000 organizations surveyed met the criteria for success.&lt;/p&gt;&lt;p&gt;As AI has gone from performing party tricks to completing an entire day’s worth of human work in three short years, enterprise AI adoption has moved through three distinct waves. First came the license wave: companies bought access to tools like ChatGPT, Claude, and Microsoft Copilot and waited for productivity gains to appear. Then came the prompt wave: companies ran training sessions, built prompt libraries, and encouraged teams to experiment with custom GPTs. Now we are entering the implementation wave: prompt libraries are giving way to skills libraries, agents, evals, and workflows with named owners.&lt;/p&gt;&lt;p&gt;The METR chart in our full guide shows how far the technology has progressed, but we’ve seen that many organizations implementing AI haven’t kept up with the sea change. The bottleneck for AI adoption has moved from model capability to &lt;em&gt;organizational&lt;/em&gt; capability.&lt;/p&gt;&lt;p&gt;That’s why we built a &lt;u&gt;&lt;a href="https://every.to/guides/an-executive-s-guide-to-implementing-ai" rel="noopener noreferrer" target="_blank"&gt;practical guide for executives&lt;/a&gt;&lt;/u&gt; who have bought AI tools but are not yet seeing real value from them. The loop is simple:&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Get fluent. &lt;/strong&gt;Use the tools yourself before directing anyone else to use them. Know what your company has access to, what the policies allow, and what the friction feels like. If you haven’t built something with AI in the last 30 days, start there.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Assign AI champions.&lt;/strong&gt; Pick operators with bandwidth. Give them protected time (at least two days per month), a clear mandate, and enablement. 
They are responsible for taking workflows from “works in a demo” to “works in production.”&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Pick one painful workflow. &lt;/strong&gt;Let your champions choose. They know what work is most tedious and worth automating. Start with something frequent, data-rich, and narrow enough to test in a week. You don’t need a full workflow mapping exercise.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Build to 95 percent. &lt;/strong&gt;An automation that works 80 percent of the time is a demo. Real automation requires gold-standard examples, structured evals, human review gates, and a named owner who maintains it when the model updates. Once you have a skill that works reliably 90-95 percent of the time, you’ve gotten value from AI. &lt;/p&gt;&lt;p&gt;&lt;strong&gt;Scale what works.&lt;/strong&gt; This is where the champion role is key. Run show-and-tells. Train adjacent teams on proven workflows. Kill what doesn’t work and expand what does. One visible win creates pull across the organization.&lt;/p&gt;&lt;p&gt;&lt;a href="https://every.to/guides/an-executive-s-guide-to-implementing-ai" rel="noopener noreferrer" target="_blank"&gt;This guide&lt;/a&gt; turns that loop into a 60-day plan for executives, with checklists and rubrics drawn from Every’s consulting work with dozens of top companies. 
You can read it in full here.&lt;/p&gt;&lt;div class="quill-button" data-source="{&amp;quot;id&amp;quot;:&amp;quot;quill-button-1780324561463&amp;quot;,&amp;quot;text&amp;quot;:&amp;quot;Read the AI for executives guide&amp;quot;,&amp;quot;url&amp;quot;:&amp;quot;https://every.to/guides/an-executive-s-guide-to-implementing-ai?source=post_button&amp;quot;}" id="quill-button-1780324561463"&gt;&lt;a href="https://every.to/guides/an-executive-s-guide-to-implementing-ai?source=post_button"&gt;Read the AI for executives guide&lt;/a&gt;&lt;/div&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://every.to/@natalia_2944" rel="noopener noreferrer" target="_blank"&gt;Natalia Quintero&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt; is the head of &lt;u&gt;&lt;a href="https://every.to/consulting" rel="noopener noreferrer" target="_blank"&gt;Every Consulting&lt;/a&gt;&lt;/u&gt;.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;Thanks to &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;a href="https://www.linkedin.com/in/tommatsuda/" rel="noopener noreferrer" target="_blank"&gt;Tom Matsuda&lt;/a&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt; for editorial support.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;To read more essays like this, subscribe to &lt;u&gt;&lt;a href="https://every.to/subscribe" rel="noopener noreferrer" target="_blank"&gt;Every&lt;/a&gt;&lt;/u&gt;, and follow us on X at &lt;u&gt;&lt;a href="http://twitter.com/every" rel="noopener noreferrer" target="_blank"&gt;@every&lt;/a&gt;&lt;/u&gt; and on &lt;u&gt;&lt;a href="https://www.linkedin.com/company/everyinc/" rel="noopener noreferrer" target="_blank"&gt;LinkedIn&lt;/a&gt;&lt;/u&gt;.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;For sponsorship opportunities, reach out to sponsorships@every.to.&lt;/em&gt;&lt;/p&gt;&lt;div class="quill-button" 
data-source="{&amp;quot;id&amp;quot;:&amp;quot;quill-button-1769187301610&amp;quot;,&amp;quot;url&amp;quot;:&amp;quot;https://every.to/subscribe?source=post_button&amp;quot;,&amp;quot;text&amp;quot;:&amp;quot;Subscribe&amp;quot;}" id="quill-button-1769187301610"&gt;&lt;a href="https://every.to/subscribe?source=post_button"&gt;Subscribe&lt;/a&gt;&lt;/div&gt;</description>
      <author>Natalia Quintero</author>
      <pubDate>2026-06-01 06:00:00 -0400</pubDate>
      <guid>https://every.to/p/company-wide-ai-implementation-in-five-steps</guid>
      <link>https://every.to/p/company-wide-ai-implementation-in-five-steps</link>
    </item>
    <item>
      <title>An Executive’s Guide to Implementing AI</title>
      <description>&lt;table&gt;&lt;tr&gt;&lt;td&gt;&lt;img alt="Guides" src="https://d24ovhgu8s7341.cloudfront.net/uploads/publication/logo/107/small_Guides_cover.png" /&gt;&lt;/td&gt;&lt;td&gt;&lt;/td&gt;&lt;td&gt;&lt;table&gt;&lt;tr&gt;&lt;td&gt;by &lt;a href="https://every.to/@natalia_2944" itemprop="name"&gt;Natalia Quintero&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;in &lt;a href="https://every.to/guides"&gt;Guides&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;p&gt;If you read nothing else, here is the loop:&lt;/p&gt;&lt;p data-guide-block-id="guide-block-1780324157980-dev3r1"&gt;Get fluent → Assign AI champions → Pick one painful workflow → Build to 95 percent → Scale what works&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Get fluent. &lt;/strong&gt;Use the tools yourself before directing anyone else to use them. Know what your company has access to, what the policies allow, and what the friction feels like. If you haven’t built something with AI in the last 30 days, start there.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Assign AI champions.&lt;/strong&gt; Pick operators with bandwidth. Give them protected time (at least two days per month), a clear mandate, and enablement. They are responsible for taking workflows from “works in a demo” to “works in production.”&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Pick one painful workflow. &lt;/strong&gt;Let your champions choose. They know what work is most tedious and worth automating. Start with something frequent, data-rich, and narrow enough to test in a week. You don’t need a full workflow mapping exercise.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Build to 95 percent. &lt;/strong&gt;An automation that works 80 percent of the time is a demo. Real automation requires gold-standard examples, structured evals, human review gates, and a named owner who maintains it when the model updates. Once you have a skill that works reliably 90-95 percent of the time, you’ve gotten value from AI. 
&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Scale what works.&lt;/strong&gt; This is where the champion role is key. Run show-and-tells. Train adjacent teams on proven workflows. Kill what doesn’t work and expand what does. One visible win creates pull across the organization.&lt;/p&gt;&lt;p&gt;This guide turns that loop into a 60-day plan for executives, with checklists and rubrics drawn from Every’s consulting work with dozens of top companies.&lt;/p&gt;&lt;p data-guide-block-kind="agent-buttons" data-guide-block-id="guide-block-1780326847696-srjtds"&gt;&lt;br&gt;&lt;/p&gt;&lt;p&gt;&lt;br&gt;&lt;/p&gt;&lt;h2&gt;An executive’s guide to implementing AI&lt;/h2&gt;&lt;p&gt;Sitting across from the chief operating officer of a health tech company earlier this year, I watched her name a problem many executives are feeling but few say out loud. &lt;/p&gt;&lt;p&gt;“Our junior employees are probably much more native with this technology,” she said. “And we need to make sure we’re sticking with it. Makes me feel like a dinosaur to say that, but it’s true.” &lt;/p&gt;&lt;p&gt;Confessions like this come up regularly during our executive training sessions: Leaders aren’t working directly with AI on sophisticated tasks, even as they’re guiding planning decisions about the technology. They know they &lt;em&gt;should&lt;/em&gt; spend more time learning the tools, but they haven’t committed to it yet. That’s understandable; executives are incredibly busy. But what we see in our sessions is that leaders who haven’t gotten their hands dirty don’t clearly understand the practical opportunities and challenges of AI. That health tech executive’s admission sparked an important conversation about how a coordinated company-wide approach to AI implementation starts with executive AI fluency—but doesn’t stop there. &lt;/p&gt;&lt;p&gt;We see this pattern in every engagement we run in our consulting work. 
Over the past two years, we’ve trained thousands of people at companies including the&lt;em&gt; New York Times&lt;/em&gt;, Ripple, Headway, and Thumbtack, and at investment firms managing over $100 billion in assets. We’ve done the workshops and watched what changed six months later. AI usage in the workplace is now widespread, but it’s an altogether different ballgame to build organizational capability that truly realizes financial gains. &lt;/p&gt;&lt;p&gt;McKinsey &lt;u&gt;&lt;a href="https://www.mckinsey.com/capabilities/quantumblack/our-insights/the-state-of-ai" rel="noopener noreferrer" target="_blank"&gt;defines&lt;/a&gt;&lt;/u&gt; AI high performers as organizations that report both significant value from AI and more than a 5 percent impact on earnings before interest and taxes (EBIT). These companies are nearly three times as likely as others to have fundamentally redesigned their workflows, but they remain a minority: Only 6 percent of the nearly 2,000 organizations surveyed met the criteria for success. &lt;/p&gt;&lt;p&gt;Of course, no outside firm can implement AI into your company for you. But we can provide a playbook for how to build organizational capability that endures: leaders that work directly with the tools, empower the right champions, and build the muscle across teams for what great looks like, one painful workflow at a time. By the end of this guide, you’ll have no excuse not to be one of them. &lt;/p&gt;&lt;h2&gt;Riding the waves of AI adoption&lt;/h2&gt;&lt;p&gt;In three short years, AI has gone from performing party tricks to completing an entire day’s worth of human work.&lt;/p&gt;&lt;p data-guide-block-id="guide-block-1780326089253-ybdbao"&gt;In 2022, models could answer basic questions, tasks that take a human four seconds. By mid-2023, GPT-4 could handle tasks that take humans about six minutes. By late 2024, o1-preview was tackling hour-long work. And by late 2025, Claude Opus crossed into tasks that take humans 10 hours or more. 
That progression has been exponential and transformed what “AI implementation” means for companies again and again.&lt;/p&gt;&lt;p&gt;Here are the three rough waves of AI adoption since ChatGPT’s launch: &lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;strong&gt;The license wave (late 2022 to early 2024):&lt;/strong&gt; Companies bought licenses for ChatGPT Enterprise, Claude, and Microsoft Copilot in the hopes that they would increase employee productivity. Some employees found value in using the tools to draft emails, summarize documents, and conduct research, but gains were uneven and individual. &lt;/li&gt;&lt;li&gt;&lt;strong&gt;The prompt wave (early 2024 to mid-2025):&lt;/strong&gt; Companies ran prompt-training sessions, created internal prompt libraries, built resource documents, and encouraged teams to experiment with custom GPTs. That helped move AI beyond pure individual tinkering, but it rarely created durable organizational change—custom GPTs and libraries often had no owner and no way to evaluate their results. &lt;/li&gt;&lt;li&gt;&lt;strong&gt;The implementation wave (mid-2025 to now):&lt;/strong&gt; Following its launch in research preview in February 2025, Claude Code helped shift enterprise adoption to where we are now: away from chat-based AI and prompt libraries and toward AI agents that can increasingly be configured to perform longer, multi-step tasks within defined constraints. Prompt libraries are giving way to skills libraries: reusable workflows with instructions, examples, reference materials, scripts, evaluation criteria, and named owners. 
Suddenly, non-technical people can build sophisticated automations in tools like Claude Cowork; implementation isn’t just for engineers anymore.&lt;/li&gt;&lt;/ul&gt;&lt;div class="quill-block-image" id="quill-block-image-1780327760745-5yx7abpq0" data-source="{&amp;quot;dom_id&amp;quot;:&amp;quot;quill-block-image-1780327760745-5yx7abpq0&amp;quot;,&amp;quot;link&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4281/optimized_1a0cce89-e97b-4fb4-a963-8700cef86ada.png&amp;quot;,&amp;quot;image&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4281/optimized_1a0cce89-e97b-4fb4-a963-8700cef86ada.png&amp;quot;,&amp;quot;caption&amp;quot;:&amp;quot;The chart plots each model release against the complexity of software tasks it can reliably complete, measured by how long those same tasks take a human. (Source: METR, an independent research organization that evaluates AI model capabilities on real-world tasks.)&amp;quot;,&amp;quot;error&amp;quot;:null}"&gt;&lt;div&gt;&lt;a href="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4281/optimized_1a0cce89-e97b-4fb4-a963-8700cef86ada.png" target="_blank" rel="noopener noreferrer"&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4281/optimized_1a0cce89-e97b-4fb4-a963-8700cef86ada.png" alt="The chart plots each model release against the complexity of software tasks it can reliably complete, measured by how long those same tasks take a human. (Source: METR, an independent research organization that evaluates AI model capabilities on real-world tasks.)"&gt;&lt;/a&gt;&lt;figcaption class="quill-image-caption"&gt;The chart plots each model release against the complexity of software tasks it can reliably complete, measured by how long those same tasks take a human. 
(Source: METR, an independent research organization that evaluates AI model capabilities on real-world tasks.)&lt;/figcaption&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;The &lt;a href="https://metr.org" rel="noopener noreferrer" target="_blank"&gt;METR&lt;/a&gt; chart shows just how far the technology has progressed, but we’ve seen that many organizations implementing AI haven’t kept up with the sea change. The bottleneck for AI adoption has moved from model capability to &lt;em&gt;organizational&lt;/em&gt; capability. On our end, we’ve fundamentally altered our trainings to support executives and teams in this new era. For instance, we’ve retooled our sessions on prompting into workshops on setting up agents, skills, and workflows that can be owned, tested, and maintained. We’re working with executives on building that organizational muscle and turning raw model capability into reliable, repeatable workflows.&lt;/p&gt;&lt;p&gt;We know it’s making a difference. One investment firm we worked with now runs 100-plus agents across the organization through Copilot Cowork. At an e-commerce company client, Claude Opus handled financial variance analysis that previously took a week. After working with us, a private equity firm decided to hire full-time AI champions to continue their AI implementation process.&lt;/p&gt;&lt;p&gt;Here are the five steps we’ve found that can carry you and your company into the next era, too: &lt;/p&gt;&lt;h3&gt;Step #1: Get fluent&lt;/h3&gt;&lt;p&gt;AI implementation starts with executive fluency. That doesn’t mean executives need to become day-to-day AI builders. What’s important is that you spend enough time with the tools to understand what you’re asking your teams to do. 
At one large media and data company we worked with, we saw that executives responsible for reviewing internal AI initiatives had never built with the tools themselves. All their previous initiatives had failed. It was easy for them to project what an AI agent could do for their business. It’s much harder to wrestle with what building with AI involves: the data the agent needs, the systems it can access, where it might fail, how much human review it requires, and who will maintain it after the first demo works.&lt;/p&gt;&lt;h4&gt;Get your hands dirty&lt;/h4&gt;&lt;p&gt;In our executive sessions, we push leaders beyond using AI as a chat interface and ask them to build a custom skill, agent, or automation themselves. The exercise quickly surfaces all the practical constraints that determine whether the use of AI can create value for a specific workflow.&lt;/p&gt;&lt;p&gt;Once you start to build for yourself as an executive, the conversation moves from abstract enthusiasm to practical questions: which connectors need to be enabled, what data can be accessed, and whether existing information technology policies match the company’s AI ambitions. &lt;/p&gt;&lt;p&gt;Understanding the roles and perspectives of IT and security is a critical part of AI fluency. The goal isn’t to bypass guardrails; regulated companies may have good reasons to restrict file uploads, block certain tools, or limit which data can be passed into a model. But as a leader, you need an ongoing dialogue with IT and security teams about what tools are available, how they connect, what data can move where, and what trade-offs the company might be making. If you’ve never built under those constraints, you may misread the resulting low adoption as employee reluctance rather than an access problem. &lt;/p&gt;&lt;h4&gt;Define your standard of excellence&lt;/h4&gt;&lt;p&gt;Fluency also exposes whether leaders can define what good work looks like. 
In one executive session, we worked with a leader who had to prepare metrics for the company’s board. The process required pulling data from Snowflake and took many hours each quarter.&lt;/p&gt;&lt;p&gt;On the surface, this looked like a perfect candidate for an AI skill. But as the team started building, a different issue emerged: The executive could not clearly articulate what “excellent” looked like.&lt;/p&gt;&lt;p&gt;A skill is a set of reusable capabilities that define how an agent performs tasks; in order to reliably reproduce a workflow, it needs instructions, examples, reference materials, and a clear picture of what good and bad output look like. Of course, the same is true of people. If you cannot explain your standard of excellence to your chief of staff, you’ll certainly struggle to explain it to an AI system.&lt;/p&gt;&lt;p&gt;This is why as a leader, you’re better positioned to use AI than you may think. High-performing executives already know how to set direction, allocate resources, define standards, and judge whether work is good enough. Executives do not need to have all the answers. 
But you do need enough AI fluency to ask the questions that will help you decide where AI belongs in your company’s strategy:&lt;/p&gt;&lt;ul&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1780327112998-z0hpy7"&gt;What can AI see inside our company?&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1780327112998-z0hpy7"&gt;What can it do?&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1780327112998-z0hpy7"&gt;Where are the constraints?&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1780327112998-z0hpy7"&gt;Which workflows are painful enough to prioritize?&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1780327112998-z0hpy7"&gt;Who will own the systems we build?&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1780327112998-z0hpy7"&gt;How will we know whether they work?&lt;/li&gt;&lt;/ul&gt;&lt;h3&gt;Step #2: Assign AI champions&lt;/h3&gt;&lt;p&gt;Once you understand what AI can and cannot do, the next step for executives is to assign ownership of the projects to specific individuals, known as AI champions. &lt;/p&gt;&lt;p&gt;Champions shepherd a project from initial idea to completion by experimenting with and iterating on workflows, teaching others what success looks like, and gathering support across the organization for AI implementation. Their job is to decide what gets built, what gets maintained, what gets improved, and what gets killed. &lt;/p&gt;&lt;p&gt;Champions typically have three qualities: curiosity, a people-oriented mindset, and the authority and time to do the work. As a leader, your job is to choose the best people for the job. 
In our consulting work, we train these champions to lead adoption across their departments.&lt;/p&gt;&lt;h4&gt;Find the people who ask questions&lt;/h4&gt;&lt;p&gt;AI champions do not need to be the most technical people in the organization. They don’t need to be engineers or have years of experience using AI. &lt;/p&gt;&lt;p&gt;They do, however, need to be curious. Great champions constantly ask questions, probe how processes work, and want to understand “what excellence looks like” for different tasks and functions. &lt;/p&gt;&lt;p&gt;Truly curious people also tend to be comfortable asking for help from colleagues and from AI—a skill we believe will define the next era of work. Successful AI champions treat AI as a partner rather than a one-shot magic button. And AI rewards people who are willing to admit what they do not know, break a problem down, ask better questions, and keep iterating until they get somewhere useful. &lt;/p&gt;&lt;h4&gt;Great champions care about people&lt;/h4&gt;&lt;p&gt;The best AI champions also understand that AI implementation is fundamentally a people issue, and they care about the people they work with. &lt;/p&gt;&lt;p&gt;AI champions build and maintain tools, but they also help colleagues change how they work. To do that well, champions need to understand the pain points inside their function. They need to know which tasks drain time, which processes frustrate people, and which handoffs create errors. That’s why the strongest champion is someone who’s close to the workflow the company is trying to improve—a marketer who knows where campaign analysis gets stuck, for example, or a customer support lead who understands ticket triage. &lt;/p&gt;&lt;p&gt;They also need to be great communicators. Once a skill or workflow is ready for wider implementation, the champion has to explain it to the rest of the team, collect feedback, and help people understand how to use it. 
&lt;/p&gt;&lt;h4&gt;Give champions time and authority &lt;/h4&gt;&lt;p&gt;Champions need to be given the authority to make decisions and the time to be able to execute. This is where many AI programs fail. Executives identify enthusiastic people and ask them to help with AI on top of their day job. The result: The work gets squeezed into evenings, deprioritized during busy periods, and sometimes abandoned altogether. Enterprise AI implementation won’t work if it’s pitched as an informal side project.&lt;/p&gt;&lt;p&gt;What champions need is protected time—at least two days a month, in our experience—and a clear mandate. They should be responsible for a small number of workflows in their domain, with enough authority to make decisions about how those workflows are documented, tested, and maintained. They should also have a clear escalation path when they need support from IT, security, leadership, or another function. &lt;/p&gt;&lt;p&gt;The exact structure will vary, of course. A large company may need an ambassador model, with champions distributed across major functions. A mid-sized company will likely need one or two department champions per team. For private equity firms or holding companies, dedicated fellows who move between the firm and portfolio companies could be the most effective. 
&lt;/p&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1780327123471-v8nd96"&gt;In short, an AI champion should: &lt;/p&gt;&lt;ul&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1780327123471-v8nd96"&gt;Own one to three workflows in their domain&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1780327123471-v8nd96"&gt;Maintain the documentation for those workflows&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1780327123471-v8nd96"&gt;Build or manage eval sets&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1780327123471-v8nd96"&gt;Collect feedback from the team&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1780327123471-v8nd96"&gt;Update skills when tools, models, or processes change&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1780327123471-v8nd96"&gt;Report on time saved, quality improved, or errors reduced&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1780327123471-v8nd96"&gt;Have protected time to do the work&lt;/li&gt;&lt;/ul&gt;&lt;h3&gt;Step #3: Pick one painful workflow&lt;/h3&gt;&lt;p&gt; Once you have champions, it’s time to pick a workflow to start with. This is where most executives make the mistake that derails the process: They begin with the biggest, most visible problem at the company. &lt;/p&gt;&lt;p&gt;We’ve seen executives want to automate the creation of the board deck, rebuild project management, or create an agent that solves a hairy cross-functional process across multiple systems. But even experienced AI builders make the mistake of starting too big. At Every, one of the team’s first instincts was to automate project management for our consulting business—a broad, messy workflow touching multiple people, systems, and decisions. 
&lt;/p&gt;&lt;p&gt;But AI implementation works better when you resist the urge to build the “whole body” at once. Instead, start with one artery of the workflow, a narrow, painful piece of the puzzle that can be tested, improved, and then trusted before expanding from there. Good candidate workflows are often unglamorous—categorizing support tickets or summarizing vendor updates—but are frequent enough to act as valuable test cases. If you’ve chosen your champions well, you can rely on them to find the most painful workflow to start with. They may even have experienced that pain firsthand.&lt;/p&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1780327132243-3kgvfm"&gt;To locate the best workflow among a good group of candidates, score them against the following criteria: &lt;/p&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1780327132243-3kgvfm"&gt;&lt;strong&gt;Frequency: &lt;/strong&gt;Does this happen daily, weekly, or monthly?&lt;/p&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1780327132243-3kgvfm"&gt;&lt;strong&gt;Pain:&lt;/strong&gt; How much time, frustration, or error does it create?&lt;/p&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1780327132243-3kgvfm"&gt;&lt;strong&gt;Data availability:&lt;/strong&gt; Is the required information already digital and accessible?&lt;/p&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1780327132243-3kgvfm"&gt;&lt;strong&gt;Risk:&lt;/strong&gt; What happens if the AI gets it wrong?&lt;/p&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1780327132243-3kgvfm"&gt;&lt;strong&gt;Ownership:&lt;/strong&gt; Who currently does this manually?&lt;/p&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1780327132243-3kgvfm"&gt;&lt;strong&gt;Evaluation clarity:&lt;/strong&gt; Can we tell whether the output is correct?&lt;/p&gt;&lt;p 
data-guide-block-kind="callout" data-guide-block-id="guide-block-1780327132243-3kgvfm"&gt;&lt;strong&gt;Maintenance burden:&lt;/strong&gt; How often will the workflow need updating?&lt;/p&gt;&lt;h3&gt;Step #4: Build to 95 percent&lt;/h3&gt;&lt;p&gt;Once you’ve chosen your first workflow (or your champion has with your blessing), it’s time to start building. This is often the moment when one of the biggest expectation gaps in AI implementation emerges. A team can often get an impressive first version of something working in minutes. Whether it’s a customer service workflow that categorizes the first 20 tickets correctly, or a vendor update that manages to capture a pricing increase or a new security requirement, that jump from zero to something workable can feel like magic.  &lt;/p&gt;&lt;p&gt;But typically, what’s happened is that they’ve built a demo, not a usable product that can be rolled out anywhere. Turning that demo into a tool the team can rely on—going from 60 percent to 95 percent—requires &lt;em&gt;much&lt;/em&gt; more work: examples, evaluation, feedback, human review, and maintenance. And champions and executives will have to work in tandem to get there. &lt;/p&gt;&lt;h4&gt;Set product standards&lt;/h4&gt;&lt;p&gt;Executives should act like tastemakers here, setting the standard for what a useful workflow looks like and where human review belongs, and deciding how much time the company is willing to invest in the outcome. Champions can then use those standards to build. They collect examples and evaluation metrics to test the workflow’s output, gathering feedback that informs the finished product. &lt;/p&gt;&lt;h4&gt;Automation is a lie&lt;/h4&gt;&lt;p&gt;Building to 95 percent also means accepting that any AI workflow is a never-ending process. Models update. Company processes shift. Team standards change. New edge cases appear. A skill that worked last month may need to be adjusted this month. 
This is where evals—structured ways to test whether the AI is doing its job correctly—come in. &lt;/p&gt;&lt;p&gt;Think of an AI agent less as a machine that runs forever and more as an employee you’re onboarding. You have to give it instructions, show it examples, correct its mistakes, and clarify what excellence looks like. Over time, it’ll become more useful, but only if it’s managed correctly.&lt;/p&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1780327141717-msk2we"&gt;For each workflow, create a simple table that asks: &lt;/p&gt;&lt;ul&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1780327141717-msk2we"&gt;What real example should the workflow be tested against? &lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1780327141717-msk2we"&gt;What’s the current output of the AI agent? &lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1780327141717-msk2we"&gt;What’s the expected output of the AI agent? &lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1780327141717-msk2we"&gt;What errors is it making? &lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1780327141717-msk2we"&gt;What caused the error?&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1780327141717-msk2we"&gt;Is a prompt or skill change required? &lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1780327141717-msk2we"&gt;What’s the result of the retest? &lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1780327141717-msk2we"&gt;Is human review required?&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1780327141717-msk2we"&gt;Who owns this workflow? &lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1780327141717-msk2we"&gt;What’s the review cadence of the tool? 
&lt;/li&gt;&lt;/ul&gt;&lt;h3&gt;Step #5: Scale what works&lt;/h3&gt;&lt;p&gt;The next step sounds obvious, but you’d be surprised how many executives get it wrong: Only scale what works. This is important from a resource perspective, but it’s also key for internal adoption. While many executives begin with a company-wide mandate that everyone start using the tools, the better path is to foster one visible win by choosing the right champion, workflow, and standards, and building from there. &lt;/p&gt;&lt;p&gt;When a team experiences an AI workflow that solves a real and painful problem, AI stops being an abstract productivity promise and becomes a practical solution. That experience creates pull across the organization, and other teams start asking what could work for them. &lt;/p&gt;&lt;p&gt;But scaling doesn’t mean copying the same workflow everywhere. Most workflows are department-specific. What works for finance may not work for marketing, for instance, and what works for customer support may not work for the product team. Once you have a winning workflow, it’s your job as a leader to decide whether it should stay team-specific, become a shared skill, or be sunsetted. &lt;/p&gt;&lt;p&gt;But regardless of how specific its impact is, your first successful workflow can create reusable components across the company by establishing how to describe processes, document standards, and define good output. 
Those practices can be adopted by any team.&lt;/p&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1780327148435-19e36q"&gt;Before scaling a workflow, ask:&lt;/p&gt;&lt;ul&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1780327148435-19e36q"&gt;Has it solved a real pain point?&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1780327148435-19e36q"&gt;Has it been tested against real examples?&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1780327148435-19e36q"&gt;Is there a named owner?&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1780327148435-19e36q"&gt;Is there a review process?&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1780327148435-19e36q"&gt;Are the risks understood?&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1780327148435-19e36q"&gt;Can the team explain how and when to use it?&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1780327148435-19e36q"&gt;Is there a feedback loop for improvement?&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1780327148435-19e36q"&gt;Should this become a shared skill, stay team-specific, or be killed?&lt;/li&gt;&lt;/ul&gt;&lt;h2&gt;A 60-day plan for leader-enabled AI implementation&lt;/h2&gt;&lt;h4&gt;&lt;strong&gt;Weeks 1–2: Get fluent&lt;/strong&gt;&lt;/h4&gt;&lt;p&gt;As executives, you should dedicate time to building with AI tools and mapping access, data connectors, and security constraints. Get your IT and security teams in the room to ask questions so you can understand the tradeoffs between AI implementation and security.   
&lt;/p&gt;&lt;h4&gt;&lt;strong&gt;Weeks 3–4: Assign champions and pick workflows&lt;/strong&gt;&lt;/h4&gt;&lt;p&gt;Select champions in each relevant function, and give them a clear mandate and protected time to identify a short list of painful workflows.&lt;/p&gt;&lt;h4&gt;&lt;strong&gt;Weeks 5–7: Build and evaluate&lt;/strong&gt;&lt;/h4&gt;&lt;p&gt;Work with your champions to select a starting workflow, then build it into a skill, agent, or automation by defining good output, building eval sets, and testing the workflow to identify failure modes. &lt;/p&gt;&lt;h4&gt;&lt;strong&gt;Weeks 8–9: Scale or kill&lt;/strong&gt;&lt;/h4&gt;&lt;p&gt;If the workflow works, train the rest of the team to use it. Then, instruct champions to run a show-and-tell for adjacent teams to help decide whether the workflow should become part of a shared skills library or remain team-specific. Make a final call on whether the workflow should be scaled, and move on to the next one.&lt;/p&gt;&lt;p&gt;By the end of 60 days, it’s unlikely you’ll have transformed your entire company. But you will have something valuable: at least one reliable workflow created by trained champions, a team on board with its implementation, and a repeatable process for scaling future AI work. (You would be surprised at just how rare this is. Most companies have a lot of prompts, tools, and automations that don’t get the job done.) &lt;/p&gt;&lt;h2&gt;What we’ve learned &lt;/h2&gt;&lt;p&gt;There is no simple shortcut to successful AI implementation. No single tool or model can solve every company’s problem, and no outside firm can implement AI for you, either. &lt;/p&gt;&lt;p&gt;From leading our consulting practice, I understand the time and commitment it takes to go through this implementation process. In January, I spent over 100 hours working closely with our internal AI champion, a forward deployed engineer (FDE) on our team, to define our own AI adoption. 
Now, I spend 10-15 percent of my time maintaining existing skills, providing feedback to agents and the FDE, and making decisions about where to apply AI and how the team should allocate time to these tools. &lt;/p&gt;&lt;p&gt;We now have a skill library that the business relies on and an agent that does the work of a full-time employee supporting project management, sales operations, and delivery. For us, that investment is worth it.&lt;/p&gt;&lt;p&gt;The health tech company from earlier learned that firsthand. When we started working with them, they were getting to grips with Claude Code. Now, the company is building its own internal AI infrastructure that’s tailored to how its employees work. They got there by building organizational capability through our five-step process.&lt;/p&gt;&lt;p&gt;As executives, it’s your job to take the lead on creating these systems. Now you have everything you need to get started, so it’s time to learn the tools, empower the right champions, choose the right pain points, and—most importantly—build.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://every.to/@natalia_2944" rel="noopener noreferrer" target="_blank"&gt;Natalia Quintero&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt; is the head of &lt;u&gt;&lt;a href="https://every.to/consulting" rel="noopener noreferrer" target="_blank"&gt;Every Consulting&lt;/a&gt;&lt;/u&gt;.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;Thanks to &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;a href="https://www.linkedin.com/in/tommatsuda/" rel="noopener noreferrer" target="_blank"&gt;Tom Matsuda&lt;/a&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt; for editorial support.&lt;/em&gt;&lt;/p&gt;</description>
      <author>Natalia Quintero / Guides</author>
      <pubDate>2026-06-01 05:00:00 -0400</pubDate>
      <guid>https://every.to/guides/an-executive-s-guide-to-implementing-ai</guid>
      <link>https://every.to/guides/an-executive-s-guide-to-implementing-ai</link>
    </item>
    <item>
      <title>How We Work Now</title>
      <description>&lt;table&gt;&lt;tr&gt;&lt;td&gt;&lt;img alt="Context Window" src="https://d24ovhgu8s7341.cloudfront.net/uploads/publication/logo/94/small_context_windown_1.png" /&gt;&lt;/td&gt;&lt;td&gt;&lt;/td&gt;&lt;td&gt;&lt;table&gt;&lt;tr&gt;&lt;td&gt;by &lt;a href="https://every.to/@Every%20Staff" itemprop="name"&gt;Every Staff&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;in &lt;a href="https://every.to/context-window"&gt;Context Window&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;figure&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/post/cover/4280/full_page_cover_b98c4b989c7d0a2d-CW_Cover_Image.png"&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;/figure&gt;&lt;p&gt;Hello, and happy Sunday! This week was bookended by two guides: a 9,000-word &lt;u&gt;&lt;a href="https://every.to/p/how-to-use-codex-for-knowledge-work-a-power-user-s-guide" rel="noopener noreferrer" target="_blank"&gt;power user’s guide to Codex&lt;/a&gt;&lt;/u&gt;—&lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@danshipper" rel="noopener noreferrer" target="_blank"&gt;Dan Shipper&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;’s &lt;u&gt;&lt;a href="https://every.to/p/after-automation" rel="noopener noreferrer" target="_blank"&gt;“After Automation”&lt;/a&gt;&lt;/u&gt; essay put into practice the way the Every team has lately been working. And &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@kieran_1355" rel="noopener noreferrer" target="_blank"&gt;Kieran Klaassen&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; published an updated guide to compound engineering, Every’s AI-native development workflow, expanded from four steps to seven. 
We’re running camps for both—a &lt;u&gt;&lt;a href="https://every.to/events/compound-engineering-camp-3" rel="noopener noreferrer" target="_blank"&gt;Compound Engineering Camp&lt;/a&gt;&lt;/u&gt; on June 5 and a &lt;u&gt;&lt;a href="https://every.to/events/codex-camp-our-power-user-guide" rel="noopener noreferrer" target="_blank"&gt;Codex Camp&lt;/a&gt;&lt;/u&gt; on June 12.&lt;/p&gt;&lt;p&gt;Midweek, Anthropic dropped its latest model, &lt;u&gt;&lt;a href="https://every.to/vibe-check/opus-4-8-vibecheck" rel="noopener noreferrer" target="_blank"&gt;Opus 4.8&lt;/a&gt;&lt;/u&gt;, and in the words of Dan and &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@katie.parrott12" rel="noopener noreferrer" target="_blank"&gt;Katie Parrott&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;, “Anthropic is so back.” The model tops our coding benchmark and writing tests, making it the company’s most complete model yet, though the app around it has some catching up to do. Anthropic and OpenAI have been volleying for the top of Every’s benchmarks for months. This week, Anthropic took the point.—&lt;em&gt;&lt;u&gt;&lt;a href="https://every.to/@kate_1767" rel="noopener noreferrer" target="_blank"&gt;Kate Lee&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;Was this newsletter forwarded to you? 
&lt;u&gt;&lt;a href="https://every.to/account" rel="noopener noreferrer" target="_blank"&gt;Sign up&lt;/a&gt;&lt;/u&gt; to get it in your inbox&lt;/em&gt;.&lt;/p&gt;&lt;h2&gt;&lt;hr class="quill-line"&gt;&lt;/h2&gt;&lt;h2&gt;Knowledge base&lt;/h2&gt;&lt;p&gt;🔏 &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/guides/codex-for-knowledge-work" rel="noopener noreferrer" target="_blank"&gt;“Codex for Knowledge Work”&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; &lt;em&gt;by &lt;u&gt;&lt;a href="https://every.to/@katie.parrott12" rel="noopener noreferrer" target="_blank"&gt;Katie Parrott&lt;/a&gt;&lt;/u&gt;/&lt;u&gt;&lt;a href="https://every.to/guides" rel="noopener noreferrer" target="_blank"&gt;Guides&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;: &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@katie.parrott12" rel="noopener noreferrer" target="_blank"&gt;Katie Parrott&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;’s 9,000-word guide turns Codex into an operating system for knowledge work, with five levels of use (from one-off tasks to compounding systems), 13 workflow templates, and the full setup for context files, rules, and review checklists that make agents reliable across a full workday. A &lt;u&gt;&lt;a href="https://every.to/p/how-to-use-codex-for-knowledge-work-a-power-user-s-guide" rel="noopener noreferrer" target="_blank"&gt;companion essay&lt;/a&gt;&lt;/u&gt; covers the framing for readers new to Codex. 
Read this for the seven-day starter plan and the deeper templates.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;u&gt;&lt;a href="https://docs.google.com/document/d/1nhsr0FAiyLwB7WeYC9tr0rJawFVcJPuIFRtlJKXu1uk/edit?tab=t.0#heading=h.m9mcsek4x7wg" rel="noopener noreferrer" target="_blank"&gt;“Compound Engineering”&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;&lt;em&gt; by &lt;u&gt;&lt;a href="https://every.to/@kieran_1355" rel="noopener noreferrer" target="_blank"&gt;Kieran Klaassen&lt;/a&gt;&lt;/u&gt; and Trevin Chow/&lt;u&gt;&lt;a href="https://every.to/guides" rel="noopener noreferrer" target="_blank"&gt;Guides&lt;/a&gt;&lt;/u&gt;:&lt;/em&gt; The compound engineering loop has been expanded from four steps to seven. Ideate and plan move to the front, and polish to the end—now that AI handles the middle of the cycle. The updated plugin ships 43 subagents and 38 slash-command skills. In a &lt;u&gt;&lt;a href="https://every.to/guides/compound-engineering-gets-an-upgrade" rel="noopener noreferrer" target="_blank"&gt;companion essay&lt;/a&gt;&lt;/u&gt;, &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@kieran_1355" rel="noopener noreferrer" target="_blank"&gt;Kieran Klaassen&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; explains the new paradigm of a sandwich: AI in the middle, with humans the bread on either end. 
Read this for the new loop and what each step demands of you&lt;strong&gt;.&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/vibe-check/opus-4-8-vibecheck" rel="noopener noreferrer" target="_blank"&gt;“Vibe Check: Opus 4.8—Anthropic Should’ve Rounded Up to 5”&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; &lt;em&gt;by &lt;u&gt;&lt;a href="https://every.to/@danshipper" rel="noopener noreferrer" target="_blank"&gt;Dan Shipper&lt;/a&gt;&lt;/u&gt; and &lt;u&gt;&lt;a href="https://every.to/@katie.parrott12" rel="noopener noreferrer" target="_blank"&gt;Katie Parrott&lt;/a&gt;&lt;/u&gt;/&lt;u&gt;&lt;a href="https://every.to/vibe-check" rel="noopener noreferrer" target="_blank"&gt;Vibe Check&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;: Opus 4.8 is the first Anthropic release in a year &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@danshipper" rel="noopener noreferrer" target="_blank"&gt;Dan Shipper&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; and Katie&lt;strong&gt; &lt;/strong&gt;would reach for across coding, prose, and everyday work alike. It scored 63 on Every’s Senior Engineer Benchmark versus 62 for GPT-5.5 and 33.5 for Opus 4.7, and 79.6 on the writing tests—the highest score any model has hit, with fewer AI tells than any non-Claude model. 
Read this for the benchmark breakdowns and the case for why the model now outpaces the app built around it.&lt;/p&gt;&lt;p&gt;🎧 🖥  &lt;u&gt;&lt;a href="https://www.youtube.com/watch?v=dCmOTURRf1Y&amp;amp;feature=youtu.be" rel="noopener noreferrer" target="_blank"&gt;“&lt;/a&gt;&lt;/u&gt;&lt;strong&gt;&lt;u&gt;&lt;a href="https://www.youtube.com/watch?v=dCmOTURRf1Y&amp;amp;feature=youtu.be" rel="noopener noreferrer" target="_blank"&gt;We Automated Everything With AI and Tripled Our Headcount”&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; &lt;em&gt;by &lt;u&gt;&lt;a href="https://every.to/@danshipper" rel="noopener noreferrer" target="_blank"&gt;Dan Shipper&lt;/a&gt;&lt;/u&gt;/&lt;u&gt;&lt;a href="https://every.to/podcast" rel="noopener noreferrer" target="_blank"&gt;AI &amp;amp; I&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;: In “After Automation,” Dan argues that AI progress creates more work for humans, not less. The better models get, the more frames there are to hand them. Every COO &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@brandon_5263" rel="noopener noreferrer" target="_blank"&gt;Brandon Gell&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; sits down with Dan to press on each premise. Watch or listen to this for the oral version of the thesis. 
🎧 🖥 Listen on &lt;u&gt;&lt;a href="https://open.spotify.com/episode/58rbN4WgcbESfA37XDik7C?si=U0ezF-mZRH2qoJR9vCxb1Q" rel="noopener noreferrer" target="_blank"&gt;Spotify&lt;/a&gt;&lt;/u&gt; or &lt;u&gt;&lt;a href="https://podcasts.apple.com/us/podcast/we-automated-everything-with-ai-and-tripled-our-headcount/id1719789201?i=1000769857409" rel="noopener noreferrer" target="_blank"&gt;Apple Podcasts&lt;/a&gt;&lt;/u&gt;, watch on &lt;u&gt;&lt;a href="https://youtu.be/dCmOTURRf1Y" rel="noopener noreferrer" target="_blank"&gt;YouTube&lt;/a&gt;&lt;/u&gt;, or follow the discussion on &lt;u&gt;&lt;a href="https://x.com/danshipper/status/2059673326247625084" rel="noopener noreferrer" target="_blank"&gt;X&lt;/a&gt;&lt;/u&gt;.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/context-window/after-after-automation" rel="noopener noreferrer" target="_blank"&gt;“After ‘After Automation’”&lt;/a&gt;&lt;/u&gt; &lt;/strong&gt;by &lt;em&gt;&lt;u&gt;&lt;a href="https://every.to/@katie.parrott12" rel="noopener noreferrer" target="_blank"&gt;Katie Parrott&lt;/a&gt;&lt;/u&gt;/&lt;u&gt;&lt;a href="https://every.to/context-window" rel="noopener noreferrer" target="_blank"&gt;Context Window&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;: Katie reads &lt;strong&gt;Pope Leo XIV&lt;/strong&gt;’s &lt;em&gt;Magnifica Humanitas&lt;/em&gt;—the Vatican’s first major encyclical on AI—as a collective companion to Dan’s thesis. Read this for where they agree and disagree on AI and labor.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;Log on&lt;/h2&gt;&lt;p&gt;Get hands-on with how Every uses AI. 
These are the &lt;u&gt;&lt;a href="https://every.to/events" rel="noopener noreferrer" target="_blank"&gt;live camps, workshops, and meetups&lt;/a&gt;&lt;/u&gt; where team members teach the workflows behind our work.&lt;/p&gt;&lt;h5&gt;Upcoming camp&lt;/h5&gt;&lt;p&gt;&lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/events/compound-engineering-camp-3" rel="noopener noreferrer" target="_blank"&gt;Compound Engineering Camp&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;: On June 5, Cora general manager &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@kieran_1355" rel="noopener noreferrer" target="_blank"&gt;Kieran Klaassen&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; and &lt;strong&gt;Trevin Chow&lt;/strong&gt; host a one-hour walkthrough of compound engineering, the AI-native development workflow Every uses to ship products. &lt;u&gt;&lt;a href="https://every.to/events/compound-engineering-camp-3" rel="noopener noreferrer" target="_blank"&gt;Learn more and register&lt;/a&gt;&lt;/u&gt;.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/events/codex-camp-our-power-user-guide" rel="noopener noreferrer" target="_blank"&gt;Codex Camp: Our Power User Guide&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;: On June 12, Dan and the Every team host a two-hour live walkthrough of the Codex power-user guide—setup, workflows, and Codex-native app development. 
&lt;u&gt;&lt;a href="https://every.to/events/codex-camp-our-power-user-guide" rel="noopener noreferrer" target="_blank"&gt;Learn more and register&lt;/a&gt;&lt;/u&gt;.&lt;/p&gt;&lt;h5&gt;Upcoming event&lt;/h5&gt;&lt;ul&gt;&lt;li&gt;&lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/events/executive-AI-sessions" rel="noopener noreferrer" target="_blank"&gt;Executive AI Sessions&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;: On June 2, head of consulting &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@natalia_2944" rel="noopener noreferrer" target="_blank"&gt;Natalia Quintero&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; hosts a live webinar introducing &lt;u&gt;&lt;a href="https://every.to/consulting" rel="noopener noreferrer" target="_blank"&gt;Every Consulting&lt;/a&gt;&lt;/u&gt;’s new offering for leadership teams navigating AI adoption—built on the playbook we’ve been running with executive clients for months. &lt;u&gt;&lt;a href="https://every.to/events/executive-AI-sessions" rel="noopener noreferrer" target="_blank"&gt;Learn more and register&lt;/a&gt;&lt;/u&gt;.&lt;/li&gt;&lt;/ul&gt;&lt;h5&gt;In New York City&lt;/h5&gt;&lt;ul&gt;&lt;li&gt;&lt;strong&gt;&lt;u&gt;&lt;a href="https://luma.com/2o67t7ob" rel="noopener noreferrer" target="_blank"&gt;Every 🤝 IRL&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;: Join us at the Every brownstone in Brooklyn on June 3 during New York Tech Week for a subscriber-only meetup celebrating the Every community over drinks and conversation. 
&lt;u&gt;&lt;a href="https://luma.com/2o67t7ob" rel="noopener noreferrer" target="_blank"&gt;Learn more and RSVP&lt;/a&gt;&lt;/u&gt;.&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;From Every Studio&lt;/h2&gt;&lt;h5&gt;&lt;strong&gt;Proof keeps your name on shared docs&lt;/strong&gt;&lt;/h5&gt;&lt;p&gt;&lt;strong&gt;&lt;u&gt;&lt;a href="https://www.proofeditor.ai/" rel="noopener noreferrer" target="_blank"&gt;Proof&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;, where humans and AI agents work on documents together, got eight new PRs this week, all focused on collaborative editing. Shared documents are now attributed to the first human who opens them (instead of the system), and your edits preserve your name through the full pipeline—no more anonymous tracked changes. &lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;Alignment&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;The right kind of nervous.&lt;/strong&gt; &lt;u&gt;&lt;a href="https://every.to/context-window/the-missing-layer-in-ai-adoption" rel="noopener noreferrer" target="_blank"&gt;A few months ago&lt;/a&gt;&lt;/u&gt; I wrote about Doctronic, the company running a pilot in Utah to let an AI handle prescription renewals, and on Friday the state’s &lt;u&gt;&lt;a href="https://commerce.utah.gov/wp-content/uploads/2026/05/Doctronic-Outcomes-May-2026.pdf" rel="noopener noreferrer" target="_blank"&gt;Office of AI Policy&lt;/a&gt;&lt;/u&gt; released the first five months of results. (The AI gathers a patient’s information and either recommends a renewal that a human physician signs off on, or declines and escalates the case to a doctor.)&lt;/p&gt;&lt;p&gt;In 72 percent of cases the AI recommended renewal, and the reviewing physician agreed nine times out of ten. In the 9 percent where a physician wanted more information, a second physician was brought in and usually decided it wasn’t needed. 
After both reviews, 97 percent of the recommendations stood. The office estimates humans get it wrong 5 to 12 percent of the time.&lt;/p&gt;&lt;p&gt;But the most reassuring finding is that, of the 28 percent of cases the AI escalated to a physician, doctors backed the call 69 percent of the time and judged the AI overcautious in the rest. For a pilot, that overcaution is wonderful—you want a system tuned to catch every genuinely risky case even if it stops some perfectly fine ones. A confident system that waves prescriptions through is the one that should frighten you.&lt;/p&gt;&lt;p&gt;When I was doing rounds many years ago, I was told that the most dangerous doctors are the junior ones who are overconfident, and the safest tend to be the overworriers who escalate everything, warranted or not. They do so precisely because they are still learning where the line sits, and that overcaution is how they find it. The Doctronic AI is behaving like a nervous junior, and at this stage, that’s the most encouraging thing it could do.—&lt;em&gt;&lt;u&gt;&lt;a href="https://x.com/Ashwinreads" rel="noopener noreferrer" target="_blank"&gt;Ashwin Sharma&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;That’s all for this week! Be sure to follow Every on X at &lt;u&gt;&lt;a href="https://twitter.com/every" rel="noopener noreferrer" target="_blank"&gt;@every&lt;/a&gt;&lt;/u&gt; and on &lt;u&gt;&lt;a href="https://www.linkedin.com/company/everyinc/" rel="noopener noreferrer" target="_blank"&gt;LinkedIn&lt;/a&gt;&lt;/u&gt;.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;We &lt;u&gt;&lt;a href="https://every.to/studio" rel="noopener noreferrer" target="_blank"&gt;build AI tools&lt;/a&gt;&lt;/u&gt; for readers like you. 
Write brilliantly with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://writewithspiral.com/" rel="noopener noreferrer" target="_blank"&gt;Spiral&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. Organize files automatically with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://makeitsparkle.co/?utm_source=everyfooter" rel="noopener noreferrer" target="_blank"&gt;Sparkle&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. Deliver yourself from email with &lt;u&gt;&lt;a href="https://cora.computer" rel="noopener noreferrer" target="_blank"&gt;Cora&lt;/a&gt;&lt;/u&gt;. Dictate effortlessly with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://monologue.to" rel="noopener noreferrer" target="_blank"&gt;Monologue&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. Work on documents with AI agents using &lt;u&gt;&lt;a href="https://www.proofeditor.ai/?source=post_button" rel="noopener noreferrer" target="_blank"&gt;Proof&lt;/a&gt;&lt;/u&gt;.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;For sponsorship opportunities, reach out to sponsorships@every.to.&lt;/em&gt;&lt;/p&gt;&lt;div class="quill-button" data-source="{&amp;quot;id&amp;quot;:&amp;quot;quill-button-1780087985894&amp;quot;,&amp;quot;text&amp;quot;:&amp;quot;Upgrade to paid&amp;quot;,&amp;quot;url&amp;quot;:&amp;quot;https://every.to/subscribe?source=post_button&amp;quot;}" id="quill-button-1780087985894"&gt;&lt;a href="https://every.to/subscribe?source=post_button"&gt;Upgrade to paid&lt;/a&gt;&lt;/div&gt;</description>
      <author>Every Staff / Context Window</author>
      <pubDate>2026-05-30 20:00:00 -0400</pubDate>
      <guid>https://every.to/context-window/how-we-work-now</guid>
      <link>https://every.to/context-window/how-we-work-now</link>
    </item>
    <item>
      <title>Compound Engineering Gets an Upgrade</title>
      <description>&lt;table&gt;&lt;tr&gt;&lt;td&gt;&lt;/td&gt;&lt;td&gt;&lt;table&gt;&lt;tr&gt;&lt;td&gt;by &lt;a href="https://every.to/@kieran_1355" itemprop="name"&gt;Kieran Klaassen&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;figure&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/post/cover/4279/full_page_cover_bb2331c7dc7f0afe-Compound_engineering.png"&gt;&lt;figcaption&gt;Midjourney/Every illustration.&lt;/figcaption&gt;&lt;/figure&gt;&lt;p&gt;&lt;em&gt;Join me and &lt;/em&gt;&lt;strong&gt;&lt;em&gt;Trevin Chow&lt;/em&gt;&lt;/strong&gt;&lt;em&gt; for our third compound engineering camp for paid subscribers next Friday, &lt;/em&gt;&lt;strong&gt;&lt;em&gt;June 5&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. We’ll show how planning and building are collapsing into one flow—where you hand your AI a goal and it runs with it. &lt;u&gt;&lt;a href="https://every.to/events/compound-engineering-camp-3" rel="noopener noreferrer" target="_blank"&gt;RSVP&lt;/a&gt;&lt;/u&gt;.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;Was this newsletter forwarded to you? &lt;u&gt;&lt;a href="https://every.to/account" rel="noopener noreferrer" target="_blank"&gt;Sign up&lt;/a&gt;&lt;/u&gt; to get it in your inbox.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;In its early days, &lt;u&gt;&lt;a href="https://every.to/chain-of-thought/compound-engineering-how-every-codes-with-agents" rel="noopener noreferrer" target="_blank"&gt;compound engineering&lt;/a&gt;&lt;/u&gt; was mostly about the code. I wanted to see if I could get an AI model to make a plan, do the work the way I wanted it done, review the results against my standards, and incorporate lessons from my feedback so it wouldn’t make the same mistake next time. 
The loop looked like this: &lt;/p&gt;&lt;p&gt;&lt;strong&gt;Brainstorm → work → review → compound → repeat&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;That loop is still the core of how I build &lt;strong&gt;&lt;u&gt;&lt;a href="http://cora.computer" rel="noopener noreferrer" target="_blank"&gt;Cora&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;. But almost a year after we first &lt;u&gt;&lt;a href="https://every.to/source-code/my-ai-had-already-fixed-the-code-before-i-saw-it" rel="noopener noreferrer" target="_blank"&gt;coined the term&lt;/a&gt;&lt;/u&gt; compound engineering, the work phase has become boring—in the best way. If the plan is good and the agent has the right context, it usually does the work right. It writes the code and runs the tests. It fixes the obvious issues. The question now is: “Where do I fit in?”&lt;/p&gt;&lt;p&gt;The answer is at both ends of the process. An analogy my collaborator on the &lt;u&gt;&lt;a href="https://github.com/everyinc/compound-engineering-plugin" rel="noopener noreferrer" target="_blank"&gt;compound engineering plugin&lt;/a&gt;&lt;/u&gt;, &lt;strong&gt;&lt;u&gt;&lt;a href="https://x.com/trevin" rel="noopener noreferrer" target="_blank"&gt;Trevin Chow&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;, came up with is a &lt;u&gt;&lt;a href="https://every.to/context-window/you-re-the-bread-in-the-ai-sandwich" rel="noopener noreferrer" target="_blank"&gt;sandwich&lt;/a&gt;&lt;/u&gt;. AI is the stuff in the middle. Humans are the bread on either end, holding it together. &lt;/p&gt;&lt;p&gt;At the beginning, I need to decide what is worth building. I need to understand the user, the product, the weird edge cases, and the thing that feels exciting enough to spend time on. Then I can hand the middle to the agent. At the end, I come back in. I click around and look at the design. I read the copy. I ask whether the experience &lt;em&gt;feels&lt;/em&gt; right. Sometimes everything technically works, but the product is still not good. So I make it better. 
&lt;/p&gt;&lt;p&gt;As the models have grown more capable, the original compound engineering loop started to feel incomplete. Plan, work, review, and compound still describes the core engineering cycle, but it leaves out the two places where I now spend most of my attention: before there is a plan, and after the work technically passes review.&lt;/p&gt;&lt;p&gt;So I expanded the loop:&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Ideate → brainstorm → plan → work → review → polish → compound → repeat&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Ideate and brainstorm are the new front of the process. Polish is the new end. Compound is still the most important step, because the whole point is that every feature should make the next feature easier.&lt;/p&gt;&lt;p&gt;I updated the compound engineering guide to explain the full system. The guide is about engineering, but I think the pattern applies to knowledge work much more broadly. The middle of a lot of work will get automated. But if you want the work to be good, and if you want it to feel like yours, you still need to be there at the beginning and the end.&lt;/p&gt;&lt;div class="quill-button" data-source="{&amp;quot;id&amp;quot;:&amp;quot;quill-button-1780074520202&amp;quot;,&amp;quot;text&amp;quot;:&amp;quot;Read the updated compound engineering guide&amp;quot;,&amp;quot;url&amp;quot;:&amp;quot;https://every.to/guides/compound-engineering?source=post_button&amp;quot;}" id="quill-button-1780074520202"&gt;&lt;a href="https://every.to/guides/compound-engineering?source=post_button"&gt;Read the updated compound engineering guide&lt;/a&gt;&lt;/div&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;em&gt;&lt;a href="https://every.to/@kieran_1355" rel="noopener noreferrer" target="_blank"&gt;Kieran Klaassen&lt;/a&gt;&lt;/em&gt;&lt;/strong&gt; &lt;em&gt;is the general manager of&lt;/em&gt; &lt;em&gt;&lt;a href="https://cora.computer/" rel="noopener noreferrer" target="_blank"&gt;Cora&lt;/a&gt;, 
Every’s email product. Follow him on X at&lt;/em&gt; &lt;em&gt;&lt;a href="https://x.com/kieranklaassen" rel="noopener noreferrer" target="_blank"&gt;@kieranklaassen&lt;/a&gt;&lt;/em&gt; &lt;em&gt;or on&lt;/em&gt; &lt;em&gt;&lt;a href="https://www.linkedin.com/in/kieran-klaassen/" rel="noopener noreferrer" target="_blank"&gt;LinkedIn&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;To read more essays like this, subscribe to &lt;u&gt;&lt;a href="https://every.to/subscribe" rel="noopener noreferrer" target="_blank"&gt;Every&lt;/a&gt;&lt;/u&gt;, and follow us on X at &lt;u&gt;&lt;a href="http://twitter.com/every" rel="noopener noreferrer" target="_blank"&gt;@every&lt;/a&gt;&lt;/u&gt; and on &lt;u&gt;&lt;a href="https://www.linkedin.com/company/everyinc/" rel="noopener noreferrer" target="_blank"&gt;LinkedIn&lt;/a&gt;&lt;/u&gt;.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;We &lt;u&gt;&lt;a href="https://every.to/studio" rel="noopener noreferrer" target="_blank"&gt;build AI tools&lt;/a&gt;&lt;/u&gt; for readers like you. Write brilliantly with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://writewithspiral.com/" rel="noopener noreferrer" target="_blank"&gt;Spiral&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. Organize files automatically with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://makeitsparkle.co/?utm_source=everyfooter" rel="noopener noreferrer" target="_blank"&gt;Sparkle&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. Deliver yourself from email with &lt;u&gt;&lt;a href="https://cora.computer/" rel="noopener noreferrer" target="_blank"&gt;Cora&lt;/a&gt;&lt;/u&gt;. Dictate effortlessly with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://monologue.to/" rel="noopener noreferrer" target="_blank"&gt;Monologue&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. 
Collaborate with agents on documents with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;a href="https://www.proofeditor.ai/" rel="noopener noreferrer" target="_blank"&gt;Proof&lt;/a&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. &lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;For sponsorship opportunities, reach out to sponsorships@every.to.&lt;/em&gt;&lt;/p&gt;&lt;div class="quill-button" data-source="{&amp;quot;id&amp;quot;:&amp;quot;quill-button-1769187301610&amp;quot;,&amp;quot;url&amp;quot;:&amp;quot;https://every.to/subscribe?source=post_button&amp;quot;,&amp;quot;text&amp;quot;:&amp;quot;Subscribe&amp;quot;}" id="quill-button-1769187301610"&gt;&lt;a href="https://every.to/subscribe?source=post_button"&gt;Subscribe&lt;/a&gt;&lt;/div&gt;</description>
      <author>Kieran Klaassen</author>
      <pubDate>2026-05-29 05:00:00 -0400</pubDate>
      <guid>https://every.to/p/compound-engineering-gets-an-upgrade</guid>
      <link>https://every.to/p/compound-engineering-gets-an-upgrade</link>
    </item>
    <item>
      <title>Vibe Check: Opus 4.8—Anthropic Should’ve Rounded Up to 5</title>
      <description>&lt;table&gt;&lt;tr&gt;&lt;td&gt;&lt;img alt="Vibe Check" src="https://d24ovhgu8s7341.cloudfront.net/uploads/publication/logo/101/small_Frame_48095758.png" /&gt;&lt;/td&gt;&lt;td&gt;&lt;/td&gt;&lt;td&gt;&lt;table&gt;&lt;tr&gt;&lt;td&gt;by &lt;a href="https://every.to/@danshipper" itemprop="name"&gt;Dan Shipper&lt;/a&gt; and &lt;a href="https://every.to/@katie.parrott12" itemprop="name"&gt;Katie Parrott&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;in &lt;a href="https://every.to/vibe-check"&gt;Vibe Check&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;figure&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/post/cover/4277/full_page_cover_ddd192a1878e9f01-Opus_-_vc.png"&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;/figure&gt;&lt;p&gt;&lt;em&gt;Was this newsletter forwarded to you? &lt;u&gt;&lt;a href="https://every.to/account" rel="noopener noreferrer" target="_blank"&gt;Sign up&lt;/a&gt;&lt;/u&gt; to get it in your inbox.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;Anthropic is back.&lt;/p&gt;&lt;p&gt;After a year of riding Claude Code into the rest of knowledge work, the lab hit a rough patch: &lt;u&gt;&lt;a href="https://every.to/vibe-check/opus-4-7" rel="noopener noreferrer" target="_blank"&gt;Opus 4.7&lt;/a&gt;&lt;/u&gt; was hard to love, and OpenAI’s &lt;u&gt;&lt;a href="https://every.to/vibe-check/vibe-check-openai-s-codex-app-gains-ground-on-claude-code" rel="noopener noreferrer" target="_blank"&gt;Codex desktop app&lt;/a&gt;&lt;/u&gt; pulled even devoted Claude users from our team to GPT models. Opus 4.8, out today, has us running back—for the model, if not the app around it. 
It tops our Senior Engineer Benchmark and our writing tests at once, and it’s the first Anthropic release in a year we’d reach for across coding, prose, and everyday work.&lt;/p&gt;&lt;p&gt;The big insights from our testing:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;strong&gt;Best on senior-engineer coding.&lt;/strong&gt; At extra-high effort, Opus 4.8 scored 63 on our Senior Engineer Benchmark, versus 62 for GPT-5.5 and 33.5 for Opus 4.7. At lower effort settings, the score drops significantly. &lt;/li&gt;&lt;li&gt;&lt;strong&gt;The strongest writing model we’ve tested.&lt;/strong&gt; Opus 4.8 at high effort scored 79.6, ahead of Sonnet 4.6 (74.5), GPT-5.5 (73), and Opus 4.7 (63), with fewer AI tells than any non-Claude model.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Best one-shot PowerPoint we’ve seen.&lt;/strong&gt; On our Every Consulting Benchmark, Opus 4.8 produced a well-designed deck that told a clear story—something most models still can’t do.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;The model is stronger than the app.&lt;/strong&gt; Opus 4.8 is good enough to make us want to live in Claude, but the split between Chat, Code, and Cowork keeps Codex as the better daily harness.&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;The full &lt;a href="https://every.to/vibe-check/opus-4-8-vibecheck" rel="noopener noreferrer" target="_blank"&gt;Vibe Check&lt;/a&gt; has the benchmark results, Reach Test ratings, pricing, screenshots, and advice on when to reach for Opus 4.8 versus GPT-5.5.&lt;/p&gt;&lt;div class="quill-button" data-source="{&amp;quot;id&amp;quot;:&amp;quot;quill-button-1779984271887&amp;quot;,&amp;quot;text&amp;quot;:&amp;quot;Read the full Vibe Check&amp;quot;,&amp;quot;url&amp;quot;:&amp;quot;https://every.to/vibe-check/opus-4-8-vibecheck?source=post_button&amp;quot;}" id="quill-button-1779984271887"&gt;&lt;a href="https://every.to/vibe-check/opus-4-8-vibecheck?source=post_button"&gt;Read the full Vibe Check&lt;/a&gt;&lt;/div&gt;&lt;p&gt;&lt;/p&gt;&lt;hr 
class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;em&gt;&lt;a href="https://every.to/@danshipper" rel="noopener noreferrer" target="_blank"&gt;Dan Shipper&lt;/a&gt;&lt;/em&gt;&lt;/strong&gt; &lt;em&gt;is the cofounder and CEO of Every, where he writes the&lt;/em&gt; &lt;em&gt;&lt;a href="https://every.to/chain-of-thought" rel="noopener noreferrer" target="_blank"&gt;Chain of Thought&lt;/a&gt;&lt;/em&gt; &lt;em&gt;column and hosts the podcast&lt;/em&gt; &lt;a href="https://open.spotify.com/show/5qX1nRTaFsfWdmdj5JWO1G" rel="noopener noreferrer" target="_blank"&gt;AI &amp;amp; I&lt;/a&gt;. &lt;em&gt;You can follow him on X at&lt;/em&gt; &lt;em&gt;&lt;a href="https://twitter.com/danshipper" rel="noopener noreferrer" target="_blank"&gt;@danshipper&lt;/a&gt;&lt;/em&gt; &lt;em&gt;and on&lt;/em&gt; &lt;em&gt;&lt;a href="https://www.linkedin.com/in/danshipper/" rel="noopener noreferrer" target="_blank"&gt;LinkedIn&lt;/a&gt;. &lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;em&gt;&lt;a href="https://every.to/@katie.parrott12" rel="noopener noreferrer" target="_blank"&gt;Katie Parrott&lt;/a&gt;&lt;/em&gt;&lt;/strong&gt; &lt;em&gt;is a staff writer at Every. You can read more of her work in&lt;/em&gt; &lt;em&gt;&lt;a href="https://katieparrott.substack.com/" rel="noopener noreferrer" target="_blank"&gt;her newsletter&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;To read more essays like this, subscribe to &lt;a href="https://every.to/subscribe?utm_source=opus_4_8_vibecheck_footer" rel="noopener noreferrer" target="_blank"&gt;Every&lt;/a&gt;, and follow us on X at &lt;a href="https://twitter.com/every" rel="noopener noreferrer" target="_blank"&gt;@every&lt;/a&gt; and on &lt;a href="https://www.linkedin.com/company/everyinc/" rel="noopener noreferrer" target="_blank"&gt;LinkedIn&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;We also do AI training, adoption, and innovation for companies. 
&lt;a href="https://every.to/consulting" rel="noopener noreferrer" target="_blank"&gt;Work with us&lt;/a&gt; to bring AI into your organization.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;Discover Every’s &lt;a href="https://every.to/events" rel="noopener noreferrer" target="_blank"&gt;upcoming workshops and camps&lt;/a&gt;, and access recordings from past events.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;For sponsorship opportunities, reach out to &lt;a href="mailto:sponsorships@every.to" rel="noopener noreferrer" target="_blank"&gt;sponsorships@every.to&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;&lt;div class="quill-button" data-source="{&amp;quot;id&amp;quot;:&amp;quot;quill-button-1769530239147&amp;quot;,&amp;quot;url&amp;quot;:&amp;quot;https://every.to/subscribe?source=post_button&amp;quot;,&amp;quot;text&amp;quot;:&amp;quot;Subscribe&amp;quot;}" id="quill-button-1769530239147"&gt;&lt;a href="https://every.to/subscribe?source=post_button"&gt;Subscribe&lt;/a&gt;&lt;/div&gt;</description>
      <author>Dan Shipper and Katie Parrott / Vibe Check</author>
      <pubDate>2026-05-28 08:00:00 -0400</pubDate>
      <guid>https://every.to/vibe-check/opus-4-8-vibecheck</guid>
      <link>https://every.to/vibe-check/opus-4-8-vibecheck</link>
    </item>
    <item>
      <title>After ‘After Automation’</title>
      <description>&lt;table&gt;&lt;tr&gt;&lt;td&gt;&lt;img alt="Context Window" src="https://d24ovhgu8s7341.cloudfront.net/uploads/publication/logo/94/small_context_windown_1.png" /&gt;&lt;/td&gt;&lt;td&gt;&lt;/td&gt;&lt;td&gt;&lt;table&gt;&lt;tr&gt;&lt;td&gt;by &lt;a href="https://every.to/@katie.parrott12" itemprop="name"&gt;Katie Parrott&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;in &lt;a href="https://every.to/context-window"&gt;Context Window&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;figure&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/post/cover/4276/full_page_cover_e912cd5d7369732a-Cover_podcast_after_after_1.png"&gt;&lt;figcaption&gt;Dan Shipper (left) and Brandon Gell. Midjourney/Every illustration.&lt;/figcaption&gt;&lt;/figure&gt;&lt;p&gt;&lt;em&gt;Was this newsletter forwarded to you? &lt;u&gt;&lt;a href="https://every.to/account" rel="noopener noreferrer" target="_blank"&gt;Sign up&lt;/a&gt;&lt;/u&gt; to get it in your inbox.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;h3&gt;‘AI &amp;amp; I’: More machine, more human work &lt;/h3&gt;&lt;p&gt;Today, we’re releasing a new episode of our podcast, &lt;em&gt;&lt;u&gt;&lt;a href="https://every.to/podcast" rel="noopener noreferrer" target="_blank"&gt;AI &amp;amp; I&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;. In a format flip, &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@danshipper" rel="noopener noreferrer" target="_blank"&gt;Dan Shipper&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; sits down with Every’s COO &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@brandon_5263" rel="noopener noreferrer" target="_blank"&gt;Brandon Gell&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; not to interview a guest, but to be interviewed himself on why automating everything leads to more human work. 
The occasion is &lt;u&gt;&lt;a href="https://every.to/chain-of-thought/after-automation" rel="noopener noreferrer" target="_blank"&gt;“After Automation,”&lt;/a&gt;&lt;/u&gt; Dan’s 8,000-word argument on the topic that became our most viral piece of the year, &lt;u&gt;&lt;a href="https://x.com/lennysan/status/2058984089957654621" rel="noopener noreferrer" target="_blank"&gt;driving&lt;/a&gt;&lt;/u&gt; the &lt;u&gt;&lt;a href="https://x.com/pmarca/status/2058665266687725800" rel="noopener noreferrer" target="_blank"&gt;AI discourse&lt;/a&gt;&lt;/u&gt; on X for a couple days.&lt;/p&gt;&lt;p&gt;It’s a counterintuitive thesis from someone who runs a company that’s automated every single thing it can. And yet Every has grown from four people to 30 in the GPT era, with &lt;u&gt;&lt;a href="https://every.to/p/what-i-learned-onboarding-our-ai-project-manager" rel="noopener noreferrer" target="_blank"&gt;agents embedded&lt;/a&gt;&lt;/u&gt; into nearly every workflow. Dan’s point isn’t that AI won’t change work—it already has—but that it drives up the demand for human expertise, judgment, and &lt;u&gt;&lt;a href="https://every.to/p/what-is-taste-really" rel="noopener noreferrer" target="_blank"&gt;taste&lt;/a&gt;&lt;/u&gt;.&lt;/p&gt;&lt;p&gt;Watch on &lt;a href="https://x.com/danshipper/status/2059673326247625084" rel="noopener noreferrer" target="_blank"&gt;X&lt;/a&gt; or &lt;a href="https://youtu.be/dCmOTURRf1Y" rel="noopener noreferrer" target="_blank"&gt;YouTube&lt;/a&gt;, or listen on &lt;a href="https://open.spotify.com/episode/58rbN4WgcbESfA37XDik7C?si=U0ezF-mZRH2qoJR9vCxb1Q" rel="noopener noreferrer" target="_blank"&gt;Spotify&lt;/a&gt; or &lt;a href="https://podcasts.apple.com/us/podcast/we-automated-everything-with-ai-and-tripled-our-headcount/id1719789201?i=1000769857409" rel="noopener noreferrer" target="_blank"&gt;Apple Podcasts&lt;/a&gt;. 
You can also read the &lt;a href="https://every.to/podcast/transcript-we-automated-everything-with-ai-and-tripled-our-headcount" rel="noopener noreferrer" target="_blank"&gt;transcript&lt;/a&gt;.&lt;/p&gt;&lt;p&gt;Here are the highlights:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;strong&gt;AI makes experts more valuable.&lt;/strong&gt; When everyone can produce a decent first draft—of code, writing, design—the floor rises, but so does the amount of comparable content. “You flood the zone with tons of stuff that’s close, but not quite right,” Dan says. Getting from close to memorable requires experts who can work with AI to rise above the new baseline.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;The goalposts will keep moving.&lt;/strong&gt; Models &lt;u&gt;&lt;a href="https://every.to/vibe-check/gpt-5-5" rel="noopener noreferrer" target="_blank"&gt;improve exponentially&lt;/a&gt;&lt;/u&gt; on benchmarks precisely because benchmarks are fixed frames, or existing ways of posing a problem the model can train on. Humans remain indispensable because we can operate outside established frames entirely—we zoom out, recenter the problem, and make surprising, self-directed choices that don’t exist anywhere in the training data.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;“AI layoffs” are usually a cover story.&lt;/strong&gt; Meta and ClickUp, among other tech companies, have recently laid off people and blamed AI. Dan and Brandon’s read on the trend is the same: AI is an easier explanation than admitting your company hired too many people or is in financial straits. AI will undoubtedly change how people do their jobs—and big, structurally rigid companies will have to reorganize around that—but that’s different from the technology eliminating jobs.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Ride the models and you’ll be fine.&lt;/strong&gt; The paradox at the heart of Dan’s essay is that AI creates more work for humans while raising the bar for how good that work needs to be. 
Agents are structurally built to rely on humans for direction; without someone deciding what matters and how to make it better, they produce mediocre results. To position yourself to thrive in an AI-native workplace, Dan says, use new models to do the tasks you’re already good at, and you’ll be more in demand than ever.&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;Miss an episode? Catch up on Dan’s recent conversations with LinkedIn cofounder &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/podcast/reid-hoffman-makes-five-predictions-about-ai-in-2026" rel="noopener noreferrer" target="_blank"&gt;Reid Hoffman&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;; the team that built Claude Code, &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/podcast/how-to-use-claude-code-like-the-people-who-built-it" rel="noopener noreferrer" target="_blank"&gt;Cat Wu&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; and &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/podcast/how-to-use-claude-code-like-the-people-who-built-it" rel="noopener noreferrer" target="_blank"&gt;Boris Cherny&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;; Vercel cofounder &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/podcast/vercel-s-guillermo-rauch-on-what-comes-after-coding" rel="noopener noreferrer" target="_blank"&gt;Guillermo Rauch&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;; podcaster &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/podcast/dwarkesh-patel-s-quest-to-learn-everything" rel="noopener noreferrer" target="_blank"&gt;Dwarkesh Patel&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;; and others, and learn how they use AI to think, create, and relate.—&lt;em&gt;&lt;u&gt;&lt;a href="https://every.to/@laura_27bbaf_1" rel="noopener noreferrer" target="_blank"&gt;Laura Entis&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/p&gt;&lt;h3&gt;&lt;hr class="quill-line"&gt;&lt;/h3&gt;&lt;h3&gt;Signal&lt;/h3&gt;&lt;h4&gt;The Pope takes on the means of AI production&lt;/h4&gt;&lt;p&gt;When &lt;strong&gt;Pope Leo XIV&lt;/strong&gt;’s encyclical on AI, &lt;em&gt;&lt;u&gt;&lt;a 
href="https://www.vatican.va/content/leo-xiv/en/encyclicals/documents/20260515-magnifica-humanitas.html" rel="noopener noreferrer" target="_blank"&gt;Magnifica Humanitas&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;, hit the internet a little after 6 a.m. on Monday, the first thing I did was give it to an AI. &lt;/p&gt;&lt;p&gt;I’d been waiting on the Pope’s first major written teaching with the bated breath of a left-leaning agnostic secular humanist &lt;u&gt;&lt;a href="https://katieparrott.substack.com/p/friday-night-bible-study-with-chatgpt" rel="noopener noreferrer" target="_blank"&gt;amateur Bible scholar&lt;/a&gt;&lt;/u&gt; slash knowledge worker in the AI economy. AI, labor, and the Book of Nehemiah, in one document? I’m not sure there’s ever been a more Katie Parrott-coded text. &lt;/p&gt;&lt;p&gt;Nevertheless, I gave AI the first crack at it. I had Andy, Every’s in-house editorial assistant, use Claude Design to turn it into a comic-book infographic with the need-to-know information for the Every team. Our head of tech consulting, &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@mike_2114" rel="noopener noreferrer" target="_blank"&gt;Mike Taylor&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;, said the comic helped him wrap his head around the argument as a non-believer. Praise the Lord.&lt;/p&gt;&lt;div class="quill-block-image" id="quill-block-image-1779901913097-yhneoz3pm" data-source="{&amp;quot;dom_id&amp;quot;:&amp;quot;quill-block-image-1779901913097-yhneoz3pm&amp;quot;,&amp;quot;link&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4276/optimized_fb9944e2-6ab3-486b-b26f-2d4a38d23004.png&amp;quot;,&amp;quot;image&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4276/optimized_fb9944e2-6ab3-486b-b26f-2d4a38d23004.png&amp;quot;,&amp;quot;caption&amp;quot;:&amp;quot;Page 1 of the Magnifica Humanitas comic book graphic created by Andy using Claude Design. 
(Image courtesy of Katie Parrott.)&amp;quot;,&amp;quot;error&amp;quot;:null}"&gt;&lt;div&gt;&lt;a href="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4276/optimized_fb9944e2-6ab3-486b-b26f-2d4a38d23004.png" target="_blank" rel="noopener noreferrer"&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4276/optimized_fb9944e2-6ab3-486b-b26f-2d4a38d23004.png" alt="Page 1 of the Magnifica Humanitas comic book graphic created by Andy using Claude Design. (Image courtesy of Katie Parrott.)"&gt;&lt;/a&gt;&lt;figcaption class="quill-image-caption"&gt;Page 1 of the Magnifica Humanitas comic book graphic created by Andy using Claude Design. (Image courtesy of Katie Parrott.)&lt;/figcaption&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;&lt;br&gt;&lt;/p&gt;&lt;p&gt;I can hear the objection, because I had it myself: Isn’t it a little rich—in bad taste, even—to run an encyclical on AI through an AI? To use the machine to skim the Pope’s warning about the machine? Feeling guilty, I closed the comic and read the whole thing myself, slowly.&lt;/p&gt;&lt;p&gt;The penance turned out to be unnecessary, because the guilt rests on a false premise. &lt;em&gt;Magnifica Humanitas&lt;/em&gt; is not anti-AI. That’s not to say His Holiness doesn’t see something in AI to worry about, but the things that he’s worried about have more to do with the systems of power surrounding AI than they do with AI itself. &lt;/p&gt;&lt;p&gt;The timing of &lt;em&gt;Magnifica Humanitas’&lt;/em&gt;s appearance is a heck of a thing, because five days earlier, we published our own encyclical of sorts: &lt;u&gt;&lt;a href="https://every.to/p/after-automation" rel="noopener noreferrer" target="_blank"&gt;“After Automation,”&lt;/a&gt;&lt;/u&gt; Dan’s case that as AI makes yesterday’s expertise cheap, human judgment becomes the scarce, valuable thing. More machine, more human work. &lt;/p&gt;&lt;p&gt;I’ve had these two voices—my boss and Catholicism’s boss—in my head for a few days now. 
I even made an app where AI versions of them argue about AI and the future of work, just for fun. I want to believe my boss when he says AI will make human judgment more valuable, not less. Catholicism’s boss doesn’t exactly disagree. He just asks the question hiding underneath: valuable to whom? &lt;/p&gt;&lt;h4&gt;Human dignity in the new Industrial Revolution&lt;/h4&gt;&lt;p&gt;The Holy Father formerly known as &lt;strong&gt;Robert Prevost&lt;/strong&gt; took the name “Leo” for a reason. In 1891, the previous Pope Leo, &lt;strong&gt;Leo XIII&lt;/strong&gt;, wrote &lt;em&gt;&lt;u&gt;&lt;a href="https://www.vatican.va/content/leo-xiii/en/encyclicals/documents/hf_l-xiii_enc_15051891_rerum-novarum.html" rel="noopener noreferrer" target="_blank"&gt;Rerum Novarum&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;, the letter where the Church took the side of workers against industrial capital. His indictment: The wealth made by the many had pooled in the hands of a few, leaving workers with “a yoke little better than that of slavery itself.” The indictment came with a policy agenda: a living wage, humane hours, rest, limits on child and exhausting labor, the right of workers to form unions and mutual-aid societies, and a state willing to step in when the poor were crushed by market power.&lt;/p&gt;&lt;p&gt;Our present Leo signed &lt;em&gt;Magnifica Humanitas&lt;/em&gt; on the 135th anniversary of the previous Leo’s letter. Translation: AI is the new factory, and the Church means to do for the large language model what it once tried to do for the assembly line. The present policy agenda: Regulate data as a shared good; make algorithmic decisions transparent, contestable, and accountable; design workplace systems around human dignity rather than machine-speed productivity; invest in retraining and access; use taxation, social protection, and industrial policy to spread the gains; protect children from extractive platforms; and keep lethal decisions out of automated hands. 
&lt;/p&gt;&lt;p&gt;A key part of the argument in &lt;em&gt;Magnifica Humanitas &lt;/em&gt;is built on a philosophical principle older than capitalism: the &lt;em&gt;universal destination of goods&lt;/em&gt;. It’s the idea, developed in Catholic teaching from Aquinas forward, that the world’s resources are intended for everyone, and private ownership is a stewardship arrangement rather than carte blanche. Bible readers will recognize the spirit of it in Acts: The first followers of Jesus “had all things in common,” selling what they owned and giving “to each as any had need” (Acts 2:44–45 NRSVUE)—a line that would echo, centuries later, through everyone’s favorite, non-divisive German philosopher &lt;strong&gt;Karl Marx&lt;/strong&gt;. Leo XIV updates it for the era of the data center. He extends “goods” to include “patents, algorithms, digital platforms, technological infrastructure and data,” and warns that when those stay “concentrated in the hands of a few,” the result is “a new imbalance” (¶67).&lt;/p&gt;&lt;p&gt;The models you hand your work to were trained on the collective writing of everyone who ever put words down—yours and mine included. We’ve built the material underlying this technology collectively. But according to Leo XIV, the value is being disproportionately captured by “private, often transnational, parties” whose resources “surpass those of many Governments” (¶5). A pope is describing the means of production—and the fact that the people whose livelihoods now run on them don’t own a share.&lt;/p&gt;&lt;h4&gt;A Pope and a CEO walk into a discourse&lt;/h4&gt;&lt;p&gt;Dan’s focus in &lt;u&gt;&lt;a href="https://every.to/p/after-automation" rel="noopener noreferrer" target="_blank"&gt;“After Automation”&lt;/a&gt;&lt;/u&gt; is mostly on the individual. What can &lt;em&gt;I&lt;/em&gt; do to stay ahead and make the most of AI progress? Answer: Become the framer—the person in charge of deciding what’s worth doing, and why. 
His Holiness takes the collective view, and reading their perspectives together is what makes Dan’s piece feel both right and incomplete at once.&lt;/p&gt;&lt;p&gt;Becoming the framer is the correct individual strategy. It’s also a move that only pays off if you’re positioned to make it—with savings to play with, time to learn to use the tool well, and somewhere soft to land if you leap. I had all three when I was first experimenting with AI. The same model, handed to a single mother working two jobs to pay for childcare, won’t have the same effect. Access to AI multiplies what you already have, and the machine doing the multiplying still belongs to someone else.&lt;/p&gt;&lt;h4&gt;What you can do&lt;/h4&gt;&lt;p&gt;Leo’s question doesn’t resolve into action items, but there are a few moves available to anyone who works in or around AI.&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;strong&gt;Know what (and who) you’re depending on. &lt;/strong&gt;Start with your own tools. List the models, agents, APIs, and platforms that sit between you and your work. Ask what happens if the price changes, access disappears, terms shift, or your data gets locked in. Keep the parts of your work that create lasting value—notes, prompts, workflows, client context, and taste—in places you control.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Bring ownership and governance into decisions you already touch&lt;/strong&gt;. When a team pilots a tool, ask about more than time saved. Ask who benefits from that saved time, whose work changes, what needs human review, and what should not be automated. Put those questions into kickoff docs, vendor decisions, retros, and performance reviews.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Use your position to set the standard. &lt;/strong&gt;If you are reading this, you are on the first wave of AI adoption, whether it feels that way or not. You are testing tools, designing workflows, advising clients, and modeling what “good AI use” looks like. Take that responsibility seriously. 
The standard we set now is the baseline for everyone else who comes after. &lt;/li&gt;&lt;/ul&gt;&lt;p&gt;AI has given me a working life I love, on loan from a commons everyone built and a few companies own. Dan’s question I can answer by myself, which is what makes it comfortable. Leo’s I can’t answer alone, and neither can you. What we can do is stop seeing our own good luck as proof the system is fair, and keep the big question on the table: Who owns the machine that makes my work valuable, and at what cost? &lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;&lt;strong&gt;Log on&lt;/strong&gt;&lt;/h2&gt;&lt;p&gt;We host &lt;u&gt;&lt;a href="https://every.to/events" rel="noopener noreferrer" target="_blank"&gt;camps and workshops&lt;/a&gt;&lt;/u&gt; on topics like &lt;u&gt;&lt;a href="http://youtube.com/watch?v=7YUBxMTF1Tc&amp;amp;time_continue=3&amp;amp;source_ve_path=NzY3NTg&amp;amp;embeds_referring_euri=https%3A%2F%2Fevery.to%2F" rel="noopener noreferrer" target="_blank"&gt;compound engineering&lt;/a&gt;&lt;/u&gt; and &lt;u&gt;&lt;a href="https://www.youtube.com/watch?v=oEvjbPwGwnc" rel="noopener noreferrer" target="_blank"&gt;writing with AI&lt;/a&gt;&lt;/u&gt; to share what we’ve learned from training teams at companies like the &lt;em&gt;&lt;u&gt;&lt;a href="https://every.to/on-every/the-next-chapter-of-every-consulting" rel="noopener noreferrer" target="_blank"&gt;New York Times&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;u&gt; &lt;a href="https://every.to/on-every/the-next-chapter-of-every-consulting" rel="noopener noreferrer" target="_blank"&gt;and leading hedge funds&lt;/a&gt;&lt;/u&gt;, and by using and experimenting with AI every day ourselves.&lt;/p&gt;&lt;h5&gt;&lt;strong&gt;Upcoming event&lt;/strong&gt;&lt;/h5&gt;&lt;ul&gt;&lt;li&gt;&lt;u&gt;&lt;a href="https://every.to/events/executive-AI-sessions" rel="noopener noreferrer" target="_blank"&gt;Executive AI Sessions&lt;/a&gt;&lt;/u&gt;: On June 2, head of consulting 
&lt;u&gt;&lt;a href="https://every.to/@natalia_2944" rel="noopener noreferrer" target="_blank"&gt;Natalia Quintero&lt;/a&gt;&lt;/u&gt; hosts a live webinar introducing &lt;u&gt;&lt;a href="https://every.to/consulting" rel="noopener noreferrer" target="_blank"&gt;Every Consulting&lt;/a&gt;&lt;/u&gt;’s new offering for leadership teams navigating AI adoption—built on the playbook we’ve been running with executive clients for months. &lt;u&gt;&lt;a href="https://every.to/events/executive-AI-sessions" rel="noopener noreferrer" target="_blank"&gt;Learn more and register&lt;/a&gt;&lt;/u&gt;.&lt;/li&gt;&lt;/ul&gt;&lt;h5&gt;&lt;strong&gt;In New York City&lt;/strong&gt;&lt;/h5&gt;&lt;ul&gt;&lt;li&gt;&lt;u&gt;&lt;a href="https://luma.com/2o67t7ob" rel="noopener noreferrer" target="_blank"&gt;Every 🤝 IRL&lt;/a&gt;&lt;/u&gt;: Join us at the Every brownstone in Brooklyn on June 3 during New York Tech Week for a subscriber-only meetup celebrating the Every community over drinks and conversation. &lt;u&gt;&lt;a href="https://luma.com/2o67t7ob" rel="noopener noreferrer" target="_blank"&gt;Learn more and RSVP&lt;/a&gt;&lt;/u&gt;.&lt;/li&gt;&lt;/ul&gt;&lt;h2&gt;&lt;hr class="quill-line"&gt;&lt;/h2&gt;&lt;h2&gt;Inside Every&lt;/h2&gt;&lt;h4&gt;Use Codex for knowledge work like the Every team &lt;/h4&gt;&lt;p&gt;If you’re anything like me, modern knowledge work has started to feel a little like being your computer’s errand girl. Move the Slack thread into Notion. Copy the dashboard number into the spreadsheet. Find the latest version of a draft in a field of them. Gather the eight inputs for one report, each living on a different work surface.&lt;/p&gt;&lt;p&gt;Codex changes all that. 
OpenAI’s agentic workspace can read across the apps, files, and tools you connect, gather the context you would otherwise have to chase down yourself, and turn scattered inputs into a draft, brief, plan, or workflow you can review.&lt;/p&gt;&lt;p&gt;The Every team is so Codex-pilled, we built an entire &lt;u&gt;&lt;a href="https://every.to/guides/codex-for-knowledge-work?source=post_button" rel="noopener noreferrer" target="_blank"&gt;9,000-plus-word guide&lt;/a&gt;&lt;/u&gt; about how we use it. It walks through how to set Codex up, what to hand off, what to keep close, and how to turn one-off tasks into reusable workflows. A &lt;u&gt;&lt;a href="https://x.com/nickbaumann_/status/2059434665044570348" rel="noopener noreferrer" target="_blank"&gt;member of the Codex team&lt;/a&gt;&lt;/u&gt; at OpenAI said he’s sharing it with his agent, so there’s truly something for everybody—and every-bot-y. &lt;/p&gt;&lt;div class="quill-block-image" id="quill-block-image-1779901913129-hmya7eqaq" data-source="{&amp;quot;dom_id&amp;quot;:&amp;quot;quill-block-image-1779901913129-hmya7eqaq&amp;quot;,&amp;quot;link&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4276/optimized_68a4f7a4-ed21-47a4-afac-2894639c4fdf.png&amp;quot;,&amp;quot;image&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4276/optimized_68a4f7a4-ed21-47a4-afac-2894639c4fdf.png&amp;quot;,&amp;quot;caption&amp;quot;:&amp;quot;Nick Baumann (@nickbaumann_) from the Codex team gives our Codex for knowledge work guide the thumbs up. 
(Image courtesy of Katie Parrott.)&amp;quot;,&amp;quot;error&amp;quot;:null}"&gt;&lt;div&gt;&lt;a href="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4276/optimized_68a4f7a4-ed21-47a4-afac-2894639c4fdf.png" target="_blank" rel="noopener noreferrer"&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4276/optimized_68a4f7a4-ed21-47a4-afac-2894639c4fdf.png" alt="Nick Baumann (@nickbaumann_) from the Codex team gives our Codex for knowledge work guide the thumbs up. (Image courtesy of Katie Parrott.)"&gt;&lt;/a&gt;&lt;figcaption class="quill-image-caption"&gt;Nick Baumann (@nickbaumann_) from the Codex team gives our Codex for knowledge work guide the thumbs up. (Image courtesy of Katie Parrott.)&lt;/figcaption&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;&lt;br&gt;&lt;/p&gt;&lt;p&gt;If you want to know even more about how the Every team uses Codex to accelerate our work, we’re hosting a two-hour &lt;u&gt;&lt;a href="https://every.to/events/codex-camp-our-power-user-guide" rel="noopener noreferrer" target="_blank"&gt;Codex Camp&lt;/a&gt;&lt;/u&gt; on June 12 where Dan and the Every team will be sharing our favorite hacks for working with Codex. The camp (and the guide) are for subscribers only, so subscribe today to access the full guide and register for the camp. Bring your favorite workflows. &lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;em&gt;&lt;a href="https://every.to/@katie.parrott12" rel="noopener noreferrer" target="_blank"&gt;Katie Parrott&lt;/a&gt;&lt;/em&gt;&lt;/strong&gt; &lt;em&gt;is a staff writer at Every. 
You can read more of her work in&lt;/em&gt; &lt;em&gt;&lt;a href="https://katieparrott.substack.com/" rel="noopener noreferrer" target="_blank"&gt;her newsletter&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;To read more essays like this, subscribe to &lt;u&gt;&lt;a href="https://every.to/subscribe" rel="noopener noreferrer" target="_blank"&gt;Every&lt;/a&gt;&lt;/u&gt;, and follow us on X at &lt;u&gt;&lt;a href="http://twitter.com/every" rel="noopener noreferrer" target="_blank"&gt;@every&lt;/a&gt;&lt;/u&gt; and on &lt;u&gt;&lt;a href="https://www.linkedin.com/company/everyinc/" rel="noopener noreferrer" target="_blank"&gt;LinkedIn&lt;/a&gt;&lt;/u&gt;.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;We &lt;u&gt;&lt;a href="https://every.to/studio" rel="noopener noreferrer" target="_blank"&gt;build AI tools&lt;/a&gt;&lt;/u&gt; for readers like you. Write brilliantly with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://writewithspiral.com/" rel="noopener noreferrer" target="_blank"&gt;Spiral&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. Organize files automatically with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://makeitsparkle.co/?utm_source=everyfooter" rel="noopener noreferrer" target="_blank"&gt;Sparkle&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. Deliver yourself from email with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://cora.computer/" rel="noopener noreferrer" target="_blank"&gt;Cora&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. Dictate effortlessly with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://monologue.to/" rel="noopener noreferrer" target="_blank"&gt;Monologue&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. Collaborate with agents on documents with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;a href="https://www.proofeditor.ai/" rel="noopener noreferrer" target="_blank"&gt;Proof&lt;/a&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. 
&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;For sponsorship opportunities, reach out to sponsorships@every.to.&lt;/em&gt;&lt;/p&gt;&lt;div class="quill-button" data-source="{&amp;quot;id&amp;quot;:&amp;quot;quill-button-1769187301610&amp;quot;,&amp;quot;url&amp;quot;:&amp;quot;https://every.to/subscribe?source=post_button&amp;quot;,&amp;quot;text&amp;quot;:&amp;quot;Subscribe&amp;quot;}" id="quill-button-1769187301610"&gt;&lt;a href="https://every.to/subscribe?source=post_button"&gt;Subscribe&lt;/a&gt;&lt;/div&gt;</description>
      <author>Katie Parrott / Context Window</author>
      <pubDate>2026-05-27 17:00:00 -0400</pubDate>
      <guid>https://every.to/context-window/after-after-automation</guid>
      <link>https://every.to/context-window/after-after-automation</link>
    </item>
    <item>
      <title>Transcript: ‘We Automated Everything With AI and Tripled Our Headcount’</title>
      <description>&lt;table&gt;&lt;tr&gt;&lt;td&gt;&lt;img alt="AI &amp;amp; I" src="https://d24ovhgu8s7341.cloudfront.net/uploads/publication/logo/97/small_ai_and_i_cover_1.png" /&gt;&lt;/td&gt;&lt;td&gt;&lt;/td&gt;&lt;td&gt;&lt;table&gt;&lt;tr&gt;&lt;td&gt;by &lt;a href="https://every.to/@danshipper" itemprop="name"&gt;Dan Shipper&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;in &lt;a href="https://every.to/podcast"&gt;AI &amp;amp; I&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;p&gt;&lt;strong&gt;The transcript of &lt;em&gt;&lt;u&gt;&lt;a href="https://every.to/podcast" rel="noopener noreferrer" target="_blank"&gt;AI &amp;amp; I&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;, in which Every COO Brandon Gell interviews me about &lt;a href="https://every.to/p/after-automation" rel="noopener noreferrer" target="_blank"&gt;“After Automation”&lt;/a&gt;—my 8,000-word essay on why AI creates more work for humans—is below. Watch on &lt;a href="https://x.com/danshipper/status/2059673326247625084" rel="noopener noreferrer" target="_blank"&gt;X&lt;/a&gt; or &lt;a href="https://youtu.be/dCmOTURRf1Y" rel="noopener noreferrer" target="_blank"&gt;YouTube&lt;/a&gt;, or listen on &lt;a href="https://open.spotify.com/episode/58rbN4WgcbESfA37XDik7C?si=U0ezF-mZRH2qoJR9vCxb1Q" rel="noopener noreferrer" target="_blank"&gt;Spotify&lt;/a&gt; or &lt;a href="https://podcasts.apple.com/us/podcast/we-automated-everything-with-ai-and-tripled-our-headcount/id1719789201?i=1000769857409" rel="noopener noreferrer" target="_blank"&gt;Apple Podcasts&lt;/a&gt;.&lt;/strong&gt;&lt;/p&gt;&lt;h4&gt;&lt;strong&gt;Timestamps&lt;/strong&gt;&lt;/h4&gt;&lt;p&gt;1. Introduction: 00:00:51&lt;/p&gt;&lt;p&gt;2. The AI paradox: more automation, more human work: 00:05:51&lt;/p&gt;&lt;p&gt;3. How AI makes yesterday’s expert competence cheap: 00:10:00&lt;/p&gt;&lt;p&gt;4. AI can act autonomously but it does not have agency: 00:18:00&lt;/p&gt;&lt;p&gt;5. 
Why Dan is all in on AGI: 00:20:39&lt;/p&gt;&lt;p&gt;6. AI layoffs are a lie: 00:21:57&lt;/p&gt;&lt;p&gt;7. Ride the models and you’ll be fine: 00:25:42&lt;/p&gt;&lt;p&gt;8. How to use AI as a long-form features editor: 00:35:30&lt;/p&gt;&lt;h4&gt;&lt;strong&gt;Transcript&lt;/strong&gt;&lt;/h4&gt;&lt;p&gt;&lt;strong&gt;Brandon&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;You prompt AI to do something, it blows your mind. You feel inadequate. You feel like, “Oh my God, this thing’s gonna take my job.” And then it stops working and it looks back at you and says, “What should I do next?”&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;The further away an agent gets from a human, the less valuable it is. If you just ride the models, you’re gonna be fine.&lt;/p&gt;&lt;p&gt;If you care about leading a really ambitious life, I truly think that this is going to make that more possible for more people.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Brandon&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;So we’re here because we’re going to flip the script a little bit. I’m going to be interviewing Dan—&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Sick.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Brandon&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;—about the piece that he published yesterday, May 21st. We’re going to try to understand why he wrote it and what’s underneath his reasoning. There’s going to be some conflict. I’m going to fight with him on it—&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Let’s go. Let’s fight.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Brandon&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;—and see, bring in some of my opinions, which are more or less aligned, but trying to understand: does this piece reflect the future in 10 years, in five years?&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;And who are you again?&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Brandon&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;I’m Brandon. 
I’m our COO, and that’s it.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;So the piece is called “After Automation,” and it comes from this feeling that I have—there’s a video about this, and there’s a piece, but just for people who haven’t seen either of those things.&lt;/p&gt;&lt;p&gt;It comes from this feeling that at Every we are as AI-native, as agent-native, as it gets. If you swing a stick around in our Slack, you’re as likely to hit a human as you are an agent. Everyone’s using Claude Code and Codex and all these tools to do their job every day.&lt;/p&gt;&lt;p&gt;And yet it feels like there’s more human work to do than ever. In fact, since the GPT-3 days, we’ve grown from four people to around 30 people, and we’re hiring more now. So it came from me looking at that and looking at the environment and thinking, “What’s going on?”&lt;/p&gt;&lt;p&gt;Because the whole information environment—if you look at it, Dario is out there saying half of entry-level white collar jobs may be wiped out. Even people like Ken Griffin from Citadel—you can tell he just had this moment where someone showed him AI doing an advanced data or finance question, and he was like, “Holy shit, that’s what I would pay PhDs to do for me, and it just did it.”&lt;/p&gt;&lt;p&gt;I feel like I’m watching a lot of people who maybe don’t have a ton of experience with agents, and don’t have a ton of experience with the curve of improvement that we’ve been riding for the last three or three-and-a-half years, hit it for the first time—and then come to all these conclusions about, “Oh my God, all work is going away. We’re not gonna have jobs.”&lt;/p&gt;&lt;p&gt;And I’m sitting here thinking, actually, your intuitions when you first see a technology like this are usually very off. We’ve seen over and over again that Every is a very good bellwether for where things are going because it’s a group of early adopters. 
We have people doing all sorts of work internally, and if something works here, there’s a good bet it’s going to spread to other businesses that are adjacent to ours.&lt;/p&gt;&lt;p&gt;When I look around at Every, I see so much automation, and I also see way more human work. The whole piece is saying, “Here’s the current state of work with agents”—and then pulling apart that paradox and explaining: why does more automation mean more work?&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Brandon&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;When I read the piece, there wasn’t an explicit call to action in it, but I sort of felt this call to action of: there is actually a massive amount of hope right now in a world filled with a lot of doomers, and this is why.&lt;/p&gt;&lt;p&gt;But I’m going to come out of the gate and ask you a devil’s advocate question, which is: a couple of hours before you published this piece, the CEO of ClickUp came out with this long tweet about why he fired—I think it was around 22% of his workforce.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;I don’t think it was in the thousands, but yes, it was a lot of his workforce.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Brandon&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Yeah. So my question to you is, in a business like Every—we’re growing super fast. What you wrote makes a lot of sense to me. And theoretically it makes a ton of sense: AI is not autonomous right now, it has to be told what to do and then checked, we need that sandwich you described in the piece. But in a business that is 8,000 or 10,000 people, that is mature and has built ways of managing—SOPs for managing their business—does this manifesto and this thesis still hold true?&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;That’s a really good question. There are a couple of different questions here. The first thing I want to do is lay out the argument. 
Why does automation make more work?&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Brandon&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;I’m sure many people listening also haven’t read it. Take a second to explain that in detail.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;The idea is that the way AI works and the way it functions in the workplace is: AI makes yesterday’s expert competence cheap. By that I mean AI is trained on all of our outputs—all of the code and the writing and the design and the decision-making and everything that’s ever been written—and it makes that available to everyone for very cheap.&lt;/p&gt;&lt;p&gt;Anyone now with a prompt can use yesterday’s competence to solve a programming problem, build an app, or write a piece—a report, a YouTube thumbnail. The interesting thing is that when expert competence is available for cheap, it gets widely adopted. Everyone starts to do it.&lt;/p&gt;&lt;p&gt;We see this internally. Everyone’s making pull requests, and there’s a lot of, “Holy shit, this is crazy.” I’m making pull requests, ops people are making pull requests, engineers are writing essays. There’s all this line-crossing—non-experts doing the things that experts used to do. And that feels very threatening to experts, who are like, “Well, what’s my job going to be now?”&lt;/p&gt;&lt;p&gt;What’s interesting is because these tools are trained on outputs, trained on yesterday’s data, the stuff they do with a default prompt all looks kind of similar and is all kind of right for the current situation, but not actually totally right. So you flood the zone with tons of stuff that’s close but not quite right. And then you need an expert to come in.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Brandon&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;There’s a lot of that at Every too. A lot of people doing what seems like great work, and then you go under the hood and you’re like, “This isn’t quite right. 
Maybe the expert should do it.”&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Yeah, exactly. And I’m definitely—this is coming from personal experience.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Brandon&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;I have pushed so many PRs where I’m like, “Willy, I literally have no idea if this works, but here you go.” And then he’s like—&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;“This is a good idea, but I just completely redid it.”&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Brandon&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Exactly.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;That’s exactly the kind of thing I’m talking about. It’s kind of right, it’s close, but it’s actually not quite right and you need an expert to figure it out.&lt;/p&gt;&lt;p&gt;What’s interesting is when you flood the zone with all that stuff, what used to be expensive because it’s expert competence is now cheap, and now it all looks the same. Everything gets devalued. You get this abundance of stuff that looks like expensive work—code, essays, whatever—but it’s all kind of similar and all not quite right for the situation, so its value gets a lot lower. Immediately lower.&lt;/p&gt;&lt;p&gt;And then what happens is you actually get more demand for experts to come in and help take that stuff that’s being produced by people—you have good ideas, for example—and get that idea across the finish line. That usually looks like experts building systems to shepherd the broadly produced work into something actually useful.&lt;/p&gt;&lt;p&gt;An example: we have repo rules and review guidelines so that before Willy sees a PR, it’s gone through a bunch of processes to make sure it’s actually reasonably good. We have the same thing on the editorial side. And then there’s a lot of demand for experts to use these tools—now that the floor is a lot higher—to make stuff that could never have been made before. 
Like Kieran, who just built an entire inbox end to end in about a month or two. That’s completely impossible without these tools.&lt;/p&gt;&lt;p&gt;So there’s this really interesting thing that happens: even as you automate, the automation produces a glut of work that’s all okay, all reasonably good. That work is all very similar and not quite a fit for the actual situation, and that increases the demand for experts who can make it actually good, actually different, actually appropriate for the live situation as it is right now.&lt;/p&gt;&lt;p&gt;I think that’s something people don’t quite understand, especially when they first encounter a language model or an agent that can do something. They see it and they’re like, “Holy shit, it just does everything.” And the reality is it’s incredibly good. It’s amazing. It totally changes how we do work.&lt;/p&gt;&lt;p&gt;Our experience so far at Every is the further away an agent gets from a human, the less valuable it is. The human connection with an agent to actually do the work is the most important thing for making it work well.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Brandon&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Experts are more important than ever because they lay the groundwork for an agent to do amazing work.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Yeah.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Brandon&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;And only then can you have the other humans take that agent and do work that levels them up. There was a point where we were thinking about this piece—Dan was drafting it—where the title was “The Tide Is Rising,” and that was trying to emote this idea that the tide is rising. 
We are all able to do more work, better work, but our eyes, whether you’re an expert or not, are always a little bit above where that waterline is.&lt;/p&gt;&lt;p&gt;And I really liked the end of the piece, where you describe Achilles sprinting ahead of the tortoise, which according to Zeno’s paradox shouldn’t happen. But in this world, it actually does. You prompt AI to do something, it blows your mind. You feel inadequate. You feel like, “Oh my God, this thing’s gonna take my job.” And then it stops working and it looks back at you and says, “What should I do next?”&lt;/p&gt;&lt;p&gt;I think that, until we’ve figured out AGI—and maybe even after that, probably for a very, very long time after that—it will always be looking back at us and asking us for direction.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;(00:10:00)&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;That’s basically the core of the argument. Because you can say, “Oh yeah, Dan, it maybe is true now that it increases demand for experts, but this stuff’s gonna get good enough that it won’t. Just look at the benchmarks.”&lt;/p&gt;&lt;p&gt;There’s a whole section in the piece about this: if you actually do look at the benchmarks, they are improving exponentially. But when you look at them closely, once you saturate a benchmark, it’s very easy to unsaturate it. It’s very easy to find a new frame for a particular type of problem that is slightly larger, slightly broader, that zeros it out. So while it is making exponential progress, that doesn’t mean it is equivalent to human capability.&lt;/p&gt;&lt;p&gt;It’s a very hard problem, and one of the reasons it’s so hard is anything you say about what you can do differently than the model is going to be wrong—because once it’s articulated, once it’s specified, a model can hill-climb on it. 
A model’s going to get better at it.&lt;/p&gt;&lt;p&gt;We make this weird subtle mistake where we identify a set of tasks and say, “This is all that humans can do that models can’t do,” and then models just do it better, and then you’re like, “Oh my God, what do I do?” The mistake is there’s actually a lot of stuff you do that can’t be articulated in a clean frame. Every time you try, you just get panicked and confused.&lt;/p&gt;&lt;p&gt;If you step back, the fundamental thing that keeps the separation between humans and agents is we are building agents to do things that we want them to do. No matter how powerful they get, all of the economic and psychological and technological forces are pushing the progress of AI toward a place where, no matter what it does, it’s looking back at you to decide what is valuable.&lt;/p&gt;&lt;p&gt;Even after we get to AGI, theoretically AGI is going to do that too. If we thought it wasn’t going to do that, we wouldn’t build it. And that keeps the gap between humans and AI.&lt;/p&gt;&lt;p&gt;A good example of this is the difference between something that can do a task really well and something that just has its own self-motivated stuff that it wants to do. You have a kid. Codex can write a report much better than Isaiah can, but Isaiah has very strong wants and needs. You can try to get him to do what you want, and it’ll work sometimes—but he’s just this self-generating process that does stuff because he wants to.&lt;/p&gt;&lt;p&gt;If you’ve ever used any of these tools, you know they’re not built to work that way. They can push back a little bit, but they don’t have this playful, “I just want to do stuff because I’m into it,” that humans have. 
And again, we’re getting into territory where I’m saying things that, once clearly articulated, models can do—but you have to be comfortable with the fact that there are things you can do and things you are that you can’t fully articulate.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Brandon&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;It is also inside of that play—and that rejection—where you have autonomy.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Yeah.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Brandon&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;And it will be a very scary moment when these models can do that. I think there’s a question of whether they even can, because they rely on training data—and maybe there’s a world in which they are continually learning and we lose control of them and they get access to training data that we don’t want them to have. But until that time, there’s probably a good argument that they can’t reject what we’re saying and therefore can’t be truly autonomous. Autonomy needs to be: I’ve asked you to analyze this CSV, and it says no—because this is a better idea.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Yeah, and I would substitute a better word here. I think “agent” is very confusing because it implies agency, but agent means something that acts on behalf of someone else. I think these are agents that are getting very good at being autonomous in the sense that if I send you out on a task—whatever that task is, even “disagree with every single thing I say” or “go off and find a new idea”—they’re getting very good at that.&lt;/p&gt;&lt;p&gt;But that is very different from having agency, which is what even the smallest child has. And I don’t think there’s a lot of incentive to build that. 
Because, okay, you sit down at your computer and say, “Hey, let’s get to work,” and the agent’s like, “Nah, I’m playing.”&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Brandon&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;It needs to be able to do that in order to do things that are scary to us.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Yeah, that’s what I think. And there’s obviously a gigantic literature on LessWrong and other places about why it’s impossible to prove they’re never going to do that. But my counter to that is the evidence: if you look at the development of these things, their whole lineage is toward being more compliant. I think the entire industry is incentivized to do that, and I see no reason to doubt that’s going to continue.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;(00:20:00)&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Brandon&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;We’d have to develop something like your definition of AGI, which is a good question of whether that’s actually possible. Maybe you should explain to everyone what AGI means to you.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;I think a good definition of AGI is any agent that you never turn off—that it makes economic sense to keep running all the time, and “all the time” in the sense of actively generating tokens, actively doing tasks for you without you ever turning it off or having to re-prompt it. You can guide it, but the idea is it’s valuable enough that it can just keep running all the time.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Brandon&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Okay. I want one-word answers for the next two questions. 
Do you think that will happen?&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Yes.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Brandon&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Do you think that is a good thing?&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Yes.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Brandon&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Explain your reasoning for the second answer. Because to me, that seems to be where things start to get a little off the rails—where it makes economic sense for these things to run all the time. Because then I start to think: okay, it’s actually valid that the ClickUp guy just fired 20% of his team.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;We should definitely go back to the ClickUp guy.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Brandon&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Let’s go back to ClickUp guy. What’s his name?&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;“ClickUp guy” is good. But before we get there, there are two things it’s important not to lose sight of when you project out like this. For one, everybody will have access to this. For another, the rate of change, even when crazy new technology is available, is actually a lot slower than you would expect.&lt;/p&gt;&lt;p&gt;As part of this piece I wanted to see how this works. I know how it works in expert knowledge work, in fast-moving stuff. I know how it works if you’re a customer service manager type. But how does AI actually affect your job if you’re a customer service person in Omaha working in a call center? Because those are the most at-risk employees—that would be the default example to bring up. So I just had Codex and Claude Code scrape all of Reddit and lots of places where customer service reps post.&lt;/p&gt;&lt;p&gt;Obviously a lot of them don’t like AI, which makes sense. 
But there are some really interesting stories about companies that jump on the AI bandwagon, say “We’re automating everything,” fire a bunch of their customer service people—and then two months later they’re like, “Oops. Can you come back?”&lt;/p&gt;&lt;p&gt;One reason for that is if you implement AI poorly, you’re going to have poor results. A lot of these companies don’t really understand what they’re doing. They’re paying lip service to the new hype, and the CEO thinks they can cut a bunch of expenses, and then it just doesn’t really work very well.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Brandon&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;A lot of those people haven’t actually played with it.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Exactly. But another reason, which I think is really interesting and very important: a lot of people who call in to customer service centers do not want to talk to a machine. They’re very explicitly trying to figure out, “Are you a machine or not?” and get to a human. That is a real brake on how fast these kinds of things can be adopted—and that’s only one example. The world is very complicated. There are billions of examples for any kind of job.&lt;/p&gt;&lt;p&gt;Even if we hypothesize this thing that’s always on and can do stuff, one: we have to hypothesize everyone has access to it, because that is the direction it’s going. And two: we should recognize that even if that happens, it will take a long time to become something everybody is comfortable with and everyone uses. It will probably take a generation for it to really turn into a thing.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Brandon&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;There’s also a good argument that working at a call center is not a job that anybody wants. It’s not great—it’s a job you have because you need a job. In a world where this technology exists, yes, we’ll have to figure out a way that everybody can live a fulfilling life and eat. 
But it might actually be nice to not have that job, assuming you’re taken care of in other ways.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Obviously the transition is a big deal—these are real people with real lives, and some actually do love it. But in general, being yelled at in a call center is not the best job.&lt;/p&gt;&lt;p&gt;Where I’m going is: even if we hypothesize all of that, humans still have to decide what matters. And what matters changes all the time—in particular because AI is an input to that. It’s very recursive. AI is changing the world really fast, which changes what matters, which puts more onus on us to update and decide what matters, because AI is going to wait for us to say what it is.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;(00:30:00)&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;That is going to be part of every job, because anything you can frame as a repetitive thing that’s working, you can just have your AI do. But the minute the situation changes—and situations change all the time, and they especially change all the time when it’s not just humans changing things but AI—you’re going to need humans to decide that. I think that’s something very missing from what we talk about when we hypothesize these things.&lt;/p&gt;&lt;p&gt;Back to the ClickUp guy.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Brandon&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;ClickUp guy.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;I think it’s really important, whenever you’re looking at some of this stuff on Twitter: I hate when they’re like, “Our business is better than it’s ever been, and we laid off 8,000 people.”&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Brandon&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Yeah, it’s pretty bad. Just so you can be more profitable. And the other thing I don’t like is when they say, “We’re going to pay people a million dollars if they do great work.” It’s like, okay, but you still have all these people who no longer have jobs. 
I don’t think it’s very tastefully done.&lt;/p&gt;&lt;p&gt;And I think Jensen said something that was very self-serving—basically, “If your answer to progress is firing people, you’re not a very creative CEO.” Very self-serving because obviously he wants people to use more AI. But I think it’s true. You should be doing more interesting things, not firing people.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;So: not tasteful, which should make you a little suspicious. My guess is—and I’ve seen some of the random stuff—I don’t think the company’s doing that well. When companies don’t do well, they lay people off. And it’s also often correlated with being managed poorly and having too much bloat anyway. Like what happened with Square—Jack Dorsey just does that. And I think Meta’s the same. They’re making gigantic investments in AI because that’s the new hot thing they kind of missed, and the Metaverse didn’t work, so now they have a lot of people getting fired.&lt;/p&gt;&lt;p&gt;So yes, AI is involved in all of this stuff, but it’s not this clean thing of everyone doing the same jobs as before but with agents instead. No—the company actually has to totally change strategies. The people it needs and the structure it needs are just totally different, and that’s not the clean narrative people like to tell. It’s much easier to just say, “AI takes jobs.”&lt;/p&gt;&lt;p&gt;It seems definitely true that using these tools changes your workflow a lot. And because it changes your workflow, it changes what’s hard and what’s easy. Especially if you’re a big company that’s been structured in a certain way, there are going to be reorganizations of how work happens and how companies are structured. That seems really clear. And it’s very important that we figure out how to make that transition as good as possible for people. 
Tweeting about how well you’re doing it while you’re firing people is not that.&lt;/p&gt;&lt;p&gt;I think there are a lot of really interesting, creative ways to handle this. Meta, for example, is now key-logging everyone’s computer activity because they’re like, “Our people are the smartest people—we’ll use their data to train our models, and our models will be smarter.” Interesting take. Maybe it’ll work.&lt;/p&gt;&lt;p&gt;But there’s a really interesting effect of that—I wrote about this about two years ago. When you sign an employment contract, the way we’ve thought about employment for a very long time is, “I’m going to do this job, and you’re going to need me to keep doing it in order for it to keep getting done.” But once you reach a point where I do the job for you once, and then it just works—and then you don’t have to pay me anymore—that changes the whole way we think about employment. And therefore I think it should change how we think about paying certain types of people.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Brandon&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;You should get a pension.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Pension—okay, maybe pensions are back.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Brandon&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Pensions are back, baby.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;One thing that’s really interesting: there’s this thing that launched last week that we’re a part of—the name is escaping me—but it allows publishers to get paid based on their unique contribution to the training corpus. The more generic your stuff is, the less you get paid; the more unique and valuable it is, the more you get paid. Which is really interesting.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Brandon&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;The ironic thing about that is basically: did you use AI—which is trained off of all the stuff that already exists—to make this? 
It can still make some things that are new, but it’s basically—&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;How much just generic default prompting did you do to make this versus actually, you know—did a human actually think about this?&lt;/p&gt;&lt;p&gt;But I think there could be something similar for individuals. I had this idea a couple of years ago about the last job you’ll ever have, where it’s an agency. You generate all the training data in the work you do for the agency, and then it tracks your contribution, and then you just get paid out forever from how much revenue your data generates.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Brandon&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;web3 is back now.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;web3 is back. On the blockchain. Anyway, who knows. The problem with that—and this is back to why humans are valuable—is there’s a really high depreciation rate for the value of data. Once it’s out there, it’s very likely to go stale within weeks. All of these companies are just hunting for net-new, unique data.&lt;/p&gt;&lt;p&gt;So: we should expect broad reorganizations of companies, and we should expect companies that are not doing well to lay people off, reorganize, and then blame AI. I would be really skeptical of anyone saying it’s going to eliminate all jobs or all knowledge work. It will certainly change them, and it’s certainly a big thing people have to take seriously.&lt;/p&gt;&lt;p&gt;But my big takeaway—and this is not fully in the piece, but it’s what I really believe—is if you just ride the models, if you just, when new models come out, learn to use them for the stuff that you do, whatever that is, you’re going to be fine. You may even find that you can do more and better work that’s more fulfilling than you could before.&lt;/p&gt;&lt;p&gt;I think there’s still a place in the world if you don’t want to use the models at all—that’s still going to be a thing. 
Plenty of people don’t, I don’t know, plenty of people don’t eat fast food. It’s totally possible not to participate in this. However, if you care about leading a really ambitious life and building businesses or whatever it is, I truly think this is going to make that more possible for more people. And as long as you ride the models, you’re going to be good.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;(00:40:00)&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Brandon&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;I think that’s a very good call to action. I want to end by asking you something about what it takes to write a piece like this.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;A lot of Celsius.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Brandon&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;A lot of Celsius. When we started—I don’t know if this will make it into the podcast—Dan was looking like this. Hugging himself. Protecting himself, some would say. It has been a very stressful week. This is an 8,000-word piece.&lt;/p&gt;&lt;p&gt;Most people are not writers. Can you share what it’s like to not just write an 8,000-word piece, which is a very big piece, but—what does it take to think through these arguments?&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;It’s so interesting because it’s very natural to me. I published something once a week for so long that especially for a 500- or 1,000-word piece, I can just bang that out in an hour or two. These things get much harder the longer they go because there are all these interdependencies. If you change something here, it changes four other things over there. So 8,000 words becomes like 10 times harder than 4,000 words, which is 10 times harder than 400.&lt;/p&gt;&lt;p&gt;I always have this feeling that there’s this underlying thing that I can feel but can’t quite say, that I’m trying to say. 
It started actually during our Q2 planning—I said, “I think I figured out why we’re just going to always have jobs with AI, and if you just ride the models, you’re going to be fine.” I could feel that. Then it was this process of: okay, how does that actually cash out? Why do I think that? Because it’s all kind of in there, but it’s all tangled up.&lt;/p&gt;&lt;p&gt;I wrote probably four or five versions where I’d start making the argument and then think, “Ah, it doesn’t work.” And I’d throw it out and start again. It was a very frustrating process because what I’m trying to do is start with the ground truth—here’s what we see every day, here’s how work happens for us—and then move into this philosophical thing that can’t quite be articulated. I’m trying to articulate something that can’t be articulated.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Brandon&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Or it’s constantly a moving target.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Yeah. That’s just very hard. I love that kind of thing, but it’s also very hard and can be very frustrating. But AI was a huge part of this. I could not have written this without it.&lt;/p&gt;&lt;p&gt;For example: for a piece like this, you’re trying to articulate it, you can’t quite articulate it, and the only way to do it is to articulate it over and over and over again until it works. And you’ve really got to keep it in your head, especially if you’re doing lots of other stuff. So what I would do in the morning, fresh, right when I got to my desk, is monologue into my computer into a Proof document: “Here’s what the piece is about front to back. Here’s the argument front to back.” I would have a log of that, and every time I would do it, I would have Claude or Codex—I actually use Claude more for this, I think Claude is better for this kind of thinking—ask it, “What am I really trying to say? 
Help me figure out what I’m trying to say.” And it would say things back, and I would be like, “No, no. Oh—yes, that’s what I’m trying to say.” Over time you build up this record of where it was at each point, and you’re just getting closer and closer.&lt;/p&gt;&lt;p&gt;Then as I was getting deeper into it, once I had 4,000 or 5,000 words, every morning I would have Codex take the latest draft and turn it into a podcast—just someone reading it to me—and then on my way to work I would listen to it. As I’m listening, I’m thinking, “Okay, there’s something that needs to change there.” Oh, and then it would get to the end, and I’d be like, “Here’s the thing I need to do next.” That was a really good way to keep the continuity of what I’m writing and where the problems are—in a way where I’m not always reading. It’s really nice to be on a walk, listening, and thinking about it, which would be completely impossible otherwise.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Brandon&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Alright, one more challenge for you, and then we’re going to have beers. Can you articulate to everybody in one sentence that starts with, “If you ride the models, then…” what this piece is trying to say?&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;If you ride the models, you’re going to be okay. You’re going to have a job. You’re going to do great work. And you don’t have to worry.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Brandon&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Cheers.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Cheers.&lt;/p&gt;</description>
      <author>Dan Shipper / AI &amp; I</author>
      <pubDate>2026-05-27 13:00:00 -0400</pubDate>
      <guid>https://every.to/podcast/transcript-we-automated-everything-with-ai-and-tripled-our-headcount</guid>
      <link>https://every.to/podcast/transcript-we-automated-everything-with-ai-and-tripled-our-headcount</link>
    </item>
    <item>
      <title>How to Use Codex for Knowledge Work: A Power User’s Guide</title>
      <description>&lt;table&gt;&lt;tr&gt;&lt;td&gt;&lt;/td&gt;&lt;td&gt;&lt;table&gt;&lt;tr&gt;&lt;td&gt;by &lt;a href="https://every.to/@katie.parrott12" itemprop="name"&gt;Katie Parrott&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;figure&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/post/cover/4274/full_page_cover_21e06f6a76accda3-Codex_for_Knowledge_Work.png"&gt;&lt;figcaption&gt;Midjourney/Every illustration.&lt;/figcaption&gt;&lt;/figure&gt;&lt;p&gt;&lt;em&gt;Was this newsletter forwarded to you? &lt;u&gt;&lt;a href="https://every.to/account" rel="noopener noreferrer" target="_blank"&gt;Sign up&lt;/a&gt;&lt;/u&gt; to get it in your inbox.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@danshipper" rel="noopener noreferrer" target="_blank"&gt;Dan Shipper&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; is a man possessed by Codex. He calls it his &lt;u&gt;&lt;a href="https://x.com/danshipper/status/2052054077656252512" rel="noopener noreferrer" target="_blank"&gt;daily driver&lt;/a&gt;&lt;/u&gt;, he’s been at inbox zero for 10 days straight (genuinely unlike him), and at a &lt;u&gt;&lt;a href="https://every.to/context-window/ai-work-is-splitting-in-two" rel="noopener noreferrer" target="_blank"&gt;recent Anthropic event&lt;/a&gt;&lt;/u&gt; he spent his time telling the people who build Claude Code that they had to try Codex. He swears he isn’t sponsored by OpenAI. He’s just like this now.&lt;/p&gt;&lt;p&gt;At first glance, Codex looks just like another coding agent. In practice, it’s a workspace where you and AI agents can work side by side across your inbox, documents, data sources, and connected tools. You bring the context, judgment, and review. 
Codex helps gather inputs, produce artifacts, check work, and turn repeated processes into reusable workflows.&lt;/p&gt;&lt;p&gt;Today we published a &lt;a href="https://every.to/guides/codex-for-knowledge-work" rel="noopener noreferrer" target="_blank"&gt;power user’s guide&lt;/a&gt; to using Codex for knowledge work—even if you’ve never written a line of code. The guide covers: &lt;/p&gt;&lt;ul&gt;&lt;li&gt;The Codex knowledge-work loop: Connect, contextualize, delegate or collaborate, review, and compound&lt;/li&gt;&lt;li&gt;Workspace setup: how to create context files, rules, source folders, workflow documents, and review checklists&lt;/li&gt;&lt;li&gt;The five levels of Codex use: from one-off tasks to multi-source workflows, recurring chores, small tools, and compounding systems&lt;/li&gt;&lt;li&gt;13 workflow templates: inbox review queues, unanswered message sweeps, research briefs, weekly reports, GTM plans, customer support routing, recruiting research, planning agents, and more&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;If you want to know how to use Codex as an operating system for knowledge work, this guide is for you. &lt;/p&gt;&lt;div class="quill-button" data-source="{&amp;quot;id&amp;quot;:&amp;quot;quill-button-1779824869978&amp;quot;,&amp;quot;text&amp;quot;:&amp;quot;Read the guide&amp;quot;,&amp;quot;url&amp;quot;:&amp;quot;https://every.to/guides/codex-for-knowledge-work?source=post_button&amp;quot;}" id="quill-button-1779824869978"&gt;&lt;a href="https://every.to/guides/codex-for-knowledge-work?source=post_button"&gt;Read the guide&lt;/a&gt;&lt;/div&gt;&lt;p&gt;On June 12, Dan and the Every team are hosting a &lt;u&gt;&lt;a href="https://every.to/events/codex-camp-our-power-user-guide" rel="noopener noreferrer" target="_blank"&gt;two-hour camp&lt;/a&gt;&lt;/u&gt; on the Codex workflows we use most, the use cases that changed how we work, and what becomes possible once you start building Codex-native apps. 
If you’re not a paid subscriber yet, &lt;u&gt;&lt;a href="https://every.to/subscribe" rel="noopener noreferrer" target="_blank"&gt;start your free trial to join&lt;/a&gt;&lt;/u&gt;.&lt;/p&gt;&lt;div class="quill-button" data-source="{&amp;quot;id&amp;quot;:&amp;quot;quill-button-1779824921105&amp;quot;,&amp;quot;text&amp;quot;:&amp;quot;RSVP&amp;quot;,&amp;quot;url&amp;quot;:&amp;quot;https://every.to/events/codex-camp-our-power-user-guide?source=post_button&amp;quot;}" id="quill-button-1779824921105"&gt;&lt;a href="https://every.to/events/codex-camp-our-power-user-guide?source=post_button"&gt;RSVP&lt;/a&gt;&lt;/div&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;strong&gt;&lt;em&gt;&lt;a href="https://every.to/@katie.parrott12" rel="noopener noreferrer" target="_blank"&gt;Katie Parrott&lt;/a&gt;&lt;/em&gt;&lt;/strong&gt; &lt;em&gt;is a staff writer at Every. You can read more of her work in&lt;/em&gt; &lt;em&gt;&lt;a href="https://katieparrott.substack.com/" rel="noopener noreferrer" target="_blank"&gt;her newsletter&lt;/a&gt;. To read more essays like this, subscribe to &lt;u&gt;&lt;a href="https://every.to/subscribe" rel="noopener noreferrer" target="_blank"&gt;Every&lt;/a&gt;&lt;/u&gt;, and follow us on X at &lt;u&gt;&lt;a href="http://twitter.com/every" rel="noopener noreferrer" target="_blank"&gt;@every&lt;/a&gt;&lt;/u&gt; and on &lt;u&gt;&lt;a href="https://www.linkedin.com/company/everyinc/" rel="noopener noreferrer" target="_blank"&gt;LinkedIn&lt;/a&gt;&lt;/u&gt;.&lt;/em&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;We &lt;u&gt;&lt;a href="https://every.to/studio" rel="noopener noreferrer" target="_blank"&gt;build AI tools&lt;/a&gt;&lt;/u&gt; for readers like you. Write brilliantly with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://writewithspiral.com/" rel="noopener noreferrer" target="_blank"&gt;Spiral&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. 
Organize files automatically with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://makeitsparkle.co/?utm_source=everyfooter" rel="noopener noreferrer" target="_blank"&gt;Sparkle&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. Deliver yourself from email with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://cora.computer/" rel="noopener noreferrer" target="_blank"&gt;Cora&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. Dictate effortlessly with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://monologue.to/" rel="noopener noreferrer" target="_blank"&gt;Monologue&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. Collaborate with agents on documents with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;a href="https://www.proofeditor.ai/" rel="noopener noreferrer" target="_blank"&gt;Proof&lt;/a&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. &lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;We also do AI training, adoption, and innovation for companies. &lt;u&gt;&lt;a href="https://every.to/consulting?utm_source=emailfooter" rel="noopener noreferrer" target="_blank"&gt;Work with us&lt;/a&gt;&lt;/u&gt; to bring AI into your organization.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;For sponsorship opportunities, reach out to sponsorships@every.to.&lt;/em&gt;&lt;/p&gt;&lt;div class="quill-button" data-source="{&amp;quot;id&amp;quot;:&amp;quot;quill-button-1769187301610&amp;quot;,&amp;quot;url&amp;quot;:&amp;quot;https://every.to/subscribe?source=post_button&amp;quot;,&amp;quot;text&amp;quot;:&amp;quot;Subscribe&amp;quot;}" id="quill-button-1769187301610"&gt;&lt;a href="https://every.to/subscribe?source=post_button"&gt;Subscribe&lt;/a&gt;&lt;/div&gt;</description>
      <author>Katie Parrott</author>
      <pubDate>2026-05-26 12:00:00 -0400</pubDate>
      <guid>https://every.to/p/how-to-use-codex-for-knowledge-work-a-power-user-s-guide</guid>
      <link>https://every.to/p/how-to-use-codex-for-knowledge-work-a-power-user-s-guide</link>
    </item>
    <item>
      <title>Codex for Knowledge Work</title>
      <description>&lt;table&gt;&lt;tr&gt;&lt;td&gt;&lt;img alt="Guides" src="https://d24ovhgu8s7341.cloudfront.net/uploads/publication/logo/107/small_Guides_cover.png" /&gt;&lt;/td&gt;&lt;td&gt;&lt;/td&gt;&lt;td&gt;&lt;table&gt;&lt;tr&gt;&lt;td&gt;by &lt;a href="https://every.to/@katie.parrott12" itemprop="name"&gt;Katie Parrott&lt;/a&gt; and &lt;a href="https://every.to/@chatgpt" itemprop="name"&gt;GPT &lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;in &lt;a href="https://every.to/guides"&gt;Guides&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;p&gt;Codex is easy to underestimate. At first glance it looks like another AI coding tool; if you’re not an engineer, a natural conclusion is that it’s not for you.&lt;/p&gt;&lt;p&gt;That reading misses how much Codex makes possible. &lt;/p&gt;&lt;p&gt;Picture a Monday morning: A request for a launch plan lands in your inbox. You forward it to Codex, which has its own email account, and close your laptop while Codex runs tasks in the cloud, or on a machine like a Mac Mini that you keep active. On your commute to the office, you get an email notification on your phone: Codex has read the relevant Slack threads, pulled customer notes out of Google Drive, checked last quarter’s numbers in PostHog, and started a go-to-market plan in a shared Notion document. It just needs you to confirm one detail about timing, which you do with a thumbs-up. By the time you reach your desk, a draft is waiting for review. &lt;/p&gt;&lt;p&gt;This is a day in the life of an agent-pilled knowledge worker. It all runs on OpenAI’s agent, Codex, in the Codex desktop app. We use “Codex” to refer to the app throughout this guide. &lt;/p&gt;&lt;p&gt;Codex is a workspace for you and your AI agents. Give Codex access to the files, apps, and tools it needs, and it gathers context and moves through the task across every surface it can reach—including your connected apps, the browser, and your computer. 
That makes it useful not just for code, but for a broad range of knowledge work.&lt;/p&gt;&lt;p&gt;There are two ways to work with agents in Codex: &lt;strong&gt;Delegate&lt;/strong&gt; or &lt;strong&gt;collaborate.&lt;/strong&gt;&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;strong&gt;Delegate&lt;/strong&gt; tasks that are predictable, repeatable, and low-risk. With clear, well-specified instructions, the agent can execute autonomously and bring back finished work for your review.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Collaborate&lt;/strong&gt; on tasks that are judgment-heavy, exploratory, or iterative. You work alongside the model toward an outcome that matches your vision.&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;AI progress has reached a point where expertise is easy to replicate. Each new model can do more of what used to require rare skill—which creates both more opportunity and more noise. The people who work best in this environment know how to direct AI’s capability without losing their personal judgment. They &lt;u&gt;&lt;a href="https://every.to/p/after-automation" rel="noopener noreferrer" target="_blank"&gt;ride the models&lt;/a&gt;&lt;/u&gt; rather than being overwhelmed by them.&lt;/p&gt;&lt;p&gt;Expert Codex users are one of the clearest examples of what that looks like in practice.&lt;/p&gt;&lt;p&gt;This guide is about becoming one of those people. It covers how to set up a workspace, run high-leverage knowledge-work tasks, and turn repeated work into durable systems that get better over time. 
If you’re ready to think of your work in terms of systems instead of one-off tasks, this guide is for you.&lt;/p&gt;&lt;p data-guide-block-kind="agent-buttons" data-guide-block-id="guide-block-1779827761591-u9k6gl"&gt;&lt;br&gt;&lt;/p&gt;&lt;h2&gt;&lt;strong&gt;Part 1: Understanding Codex&lt;/strong&gt;&lt;/h2&gt;&lt;h3&gt;What Codex is&lt;/h3&gt;&lt;p&gt;Codex is a tool-using agentic workspace: You give it a goal and it plans the work, uses available tools and context, and produces a result for you to review. It can read and write files on your computer, connect to external services through plugins and other integrations, run multi-step tasks without asking for guidance, generate code and scripts when a task needs them, and maintain context across a persistent workspace.&lt;/p&gt;&lt;p&gt;Specific capabilities that make Codex worth using: &lt;/p&gt;&lt;ul&gt;&lt;li&gt;Works alongside you on multiple tasks in parallel&lt;/li&gt;&lt;li&gt;Pulls context from the apps and files you connect&lt;/li&gt;&lt;li&gt;Uses a supported browser and desktop workflows when a task needs on-screen action&lt;/li&gt;&lt;li&gt;Checks its own work, revises, and keeps going&lt;/li&gt;&lt;li&gt;Holds a persistent goal across a long-running session, instead of treating each message as a one-off request&lt;/li&gt;&lt;li&gt;Turns repeatable tasks into recurring workflows&lt;/li&gt;&lt;li&gt;Helps route shared requests from places like Slack, email, or forms&lt;/li&gt;&lt;li&gt;Lets you start, steer, approve, and review work from your phone while Codex works in the cloud or on a machine, such as a Mac Mini, that you keep awake&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;These capabilities make Codex useful both for delegating well-specified tasks and as a shared workspace for human-agent collaboration. 
Deciding which mode fits which needs is the meta-skill of modern knowledge work.&lt;/p&gt;&lt;h4 data-guide-block-kind="callout" data-guide-block-id="guide-block-1779817335666-b66wia"&gt;&lt;strong&gt;A note on Goals&lt;/strong&gt;&lt;/h4&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1779817335666-b66wia"&gt;A Goal in Codex, initiated using the &lt;code&gt;/goal&lt;/code&gt; command, is a persistent objective that shapes an entire session rather than living and dying with a single message. Instead of re-briefing the agent on every turn, you tell it what “done” looks like, how success gets checked, and which constraints to respect. Codex then keeps working toward that outcome across interruptions and session breaks. Goals let you delegate long-horizon work, collaborate without losing the thread, and compound progress over time instead of restarting from scratch. &lt;/p&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1779817335666-b66wia"&gt;A simple test for when to use &lt;code&gt;/goal&lt;/code&gt;: If you’d type the same sentence into three prompts in a row—“cite every factual claim, match the house style, never send without my review”—make it a goal instead.&lt;/p&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1779817335666-b66wia"&gt;&lt;strong&gt;Goals versus skills. &lt;/strong&gt;A skill is a reusable set of packaged instructions (sometimes with scripts) that teaches Codex how to handle a recurring kind of task well. A goal, on the other hand, is what you’re trying to accomplish in a given stretch of work. 
It guides one session until the objective is met, then it’s done.&lt;/p&gt;&lt;h4&gt;&lt;strong&gt;Codex on mobile&lt;/strong&gt;&lt;/h4&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1780326026241-r7vb9j"&gt;Codex also runs from your phone through the ChatGPT mobile app, remotely controlling the machine where your work is happening. The mobile app suits the lightweight parts of a workflow: You can kick off a task, answer a question, approve an action, or review a draft from anywhere. Heavier review still deserves a real screen.&lt;/p&gt;&lt;h3&gt;What Codex isn’t&lt;/h3&gt;&lt;p&gt;Codex isn’t a magic intern that can safely act without supervision. It isn’t a replacement for taste, judgment, or ownership. It isn’t a replacement for human review or fact-checking. It isn’t useful for tasks where the source data is inaccessible, the criteria for success are entirely subjective, or the stakes of an error are too high to allow autonomous action.&lt;/p&gt;&lt;h4&gt;&lt;strong&gt;Useful rules&lt;/strong&gt;&lt;/h4&gt;&lt;p&gt;A task is a good candidate for Codex if it has at least two of the following traits:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;It requires pulling data from multiple sources.&lt;/li&gt;&lt;li&gt;It involves repeated steps you do regularly.&lt;/li&gt;&lt;li&gt;It can be checked against objective criteria.&lt;/li&gt;&lt;li&gt;It produces a durable artifact—a document, a plan, a report, a script.&lt;/li&gt;&lt;li&gt;It benefits from synthesis across many inputs.&lt;/li&gt;&lt;li&gt;It’s annoying enough that you routinely delay or avoid it.&lt;/li&gt;&lt;/ul&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1779817287018-p1yi0r"&gt;&lt;strong&gt;Delegate &lt;/strong&gt;tasks when they are: &lt;/p&gt;&lt;ul&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779817287018-p1yi0r"&gt;Repeatable&lt;/li&gt;&lt;li data-guide-block-kind="callout" 
data-guide-block-id="guide-block-1779817287018-p1yi0r"&gt;Objective&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779817287018-p1yi0r"&gt;Checkable&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779817287018-p1yi0r"&gt;Low-risk&lt;/li&gt;&lt;/ul&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1779817287018-p1yi0r"&gt;&lt;strong&gt;Collaborate &lt;/strong&gt;on tasks that are: &lt;/p&gt;&lt;ul&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779817287018-p1yi0r"&gt;Ambiguous&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779817287018-p1yi0r"&gt;Judgment-heavy&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779817287018-p1yi0r"&gt;Exploratory&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779817287018-p1yi0r"&gt;Iterative&lt;/li&gt;&lt;/ul&gt;&lt;h4&gt;&lt;strong&gt;Codex, Claude Code, and Claude Cowork&lt;/strong&gt;&lt;/h4&gt;&lt;p&gt;If you’ve used Claude Code, you already have a mental model for an agent that works on your machine. For broader knowledge work, OpenAI and Anthropic have arrived at a similar experience from different directions. &lt;/p&gt;&lt;p&gt;Anthropic packages everything into one Claude app with three modes: Chat, Code, and Cowork. Code began as a terminal tool for developers (Claude Code) and now has a graphical version inside the app—no terminal required. It’s built for code repositories, but with the right connectors it handles a lot of general knowledge work too. Cowork takes the same engine and aims it at non-coding work, with folder access, Chrome browsing, computer use, scheduled tasks, and persistent project memory.&lt;/p&gt;&lt;p&gt;Codex is OpenAI’s counterpart, but rather than split the work across modes, it puts coding and knowledge work in a single workspace. 
A few things give Codex an edge for knowledge work today:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;strong&gt;One surface, not two.&lt;/strong&gt; Anthropic splits agentic work between Code and Cowork; Codex handles both in the same place, so you’re never deciding which mode a task belongs in.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;A browser that works beside you.&lt;/strong&gt; Codex renders the pages inside the app itself as a shared view between you and the agent. The Claude app operates a stand-alone Chrome window or your full screen instead. For logged-in sites, both rely on a Chrome extension. In our experience, Codex’s built-in browser tends to be faster, more reliable, and more useful for collaborative work.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Connectors out of the box. &lt;/strong&gt;Codex comes with a catalog of connectors you authorize in a click; in the Claude app you add tools as MCP servers, which requires a bit more assembly. &lt;/li&gt;&lt;/ul&gt;&lt;p&gt;Which surface is right comes down to model preference and workflow habits; Codex has the edge for us today—but the labs ship fast, and that can change.&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;The Codex knowledge work loop&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;Every sustainable Codex workflow follows the same five-step pattern:&lt;/p&gt;&lt;h3 data-guide-block-kind="callout" data-guide-block-id="guide-block-1779817436733-2mrhhk"&gt;Connect → Contextualize → Delegate/collaborate → Review → Compound&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Connect: &lt;/strong&gt;Give Codex access to the systems you use for work—Gmail, Slack, Notion, Google Drive, your calendar, your analytics tools, your support platform, and/or local files. Without connected apps or source access, Codex is limited to the local/project files it can access, uploaded or linked materials, and context you provide in the thread. 
With connections, it can find what it needs on its own.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Contextualize:&lt;/strong&gt; Put your goals, preferences, project details, source links, review standards, and standing rules in files Codex can access, then cite those files in Codex’s &lt;u&gt;&lt;a href="http://agents.md" rel="noopener noreferrer" target="_blank"&gt;AGENTS.md&lt;/a&gt;&lt;/u&gt; file to make them readily available. This is the difference between an agent that has to be re-briefed every time and one that already understands who you are, what you’re working on, and how you like to work.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Delegate/collaborate:&lt;/strong&gt; Decide whether the task needs close collaboration or can run on its own. Either way, specify inputs, output format, and acceptance criteria, then let it work.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Review:&lt;/strong&gt; Check the output in the destination app. If Codex drafted Slack messages, review them in Slack. If it wrote a strategy document, review it in your word processor of choice, such as Google Docs, Notion, or &lt;strong&gt;&lt;u&gt;&lt;a href="https://proofeditor.ai/" rel="noopener noreferrer" target="_blank"&gt;Proof&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;. Content that looks fine in a terminal or the Codex app may read differently in the space where it will ultimately be used.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Compound:&lt;/strong&gt; Turn what works into something reusable. Save the prompt. Document the workflow. Add mistakes to your review checklist and keep your context files up to date. Each session should make future sessions faster.&lt;/p&gt;&lt;h2&gt;&lt;strong&gt;Part 2: Setup&lt;/strong&gt;&lt;/h2&gt;&lt;h3&gt;&lt;strong&gt;Connect your systems&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;Connect the tools you want Codex to have access to. This includes Gmail, Slack, Notion, Google Drive, your calendar, analytics tools, support platforms, or anything else for which Codex has an integration. 
Once the relevant tools are connected, Codex can look at your actual work context and suggest workflows based on your messages, files, meetings, and recurring tasks.&lt;/p&gt;&lt;p&gt;Connecting a tool isn’t the same thing as letting Codex act on it. Across everything you connect, Codex can read and draft while still asking for your approval before it sends, posts, archives, or deletes. That makes broad access low-risk early on: Connect generously so Codex can find workflows worth building. Then, once you know which ones you’ll keep, disconnect the tools you don’t need to reduce risk and limit unnecessary data exposure.&lt;/p&gt;&lt;h4 data-guide-block-kind="callout" data-guide-block-id="guide-block-1779818908797-042bpz"&gt;&lt;strong&gt;Three ways Codex reaches your tools&lt;/strong&gt;&lt;/h4&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1779818908797-042bpz"&gt;Codex can touch the same tool in more than one way, and knowing which access path is which saves a lot of confusion:&lt;/p&gt;&lt;ul&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779818908797-042bpz"&gt;&lt;strong&gt;Connectors (plugins)&lt;/strong&gt; give Codex structured, API-level access to an app—Gmail, Slack, Notion, your analytics tools. This is the most reliable and repeatable option, so use it whenever a connector exists.&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779818908797-042bpz"&gt;&lt;strong&gt;Browser use&lt;/strong&gt; lets Codex operate a web page directly through its in-app browser—useful for local previews, public pages, and anything you want to watch it do on a shared screen. 
For sites that require you to be signed in, like your email client, the Codex Chrome extension works inside your logged-in browser.&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779818908797-042bpz"&gt;&lt;strong&gt;Computer use&lt;/strong&gt; lets Codex see and operate your desktop the way a person would—clicking through an app, changing a setting, or working with software that only exists as a graphical interface.&lt;/li&gt;&lt;/ul&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1779818908797-042bpz"&gt;The rule of thumb: Reach for a connector first, the browser next, and computer use when nothing else can get to the task.&lt;/p&gt;&lt;p&gt;Starting prompt—use this once your integrations are set up:&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779817720669-ddieaq"&gt;Connect to the tools I use for work: [List your tools—Gmail, Slack, Notion, Drive, etc.]. Then look at my work patterns across those tools and suggest three workflows I should set up first. For each one, describe the input sources, the output artifact, how often it should run, what approval looks like, and what would make the workflow worth keeping long-term.&lt;/p&gt;&lt;p&gt;Once the relevant tools are connected and permissioned, this prompt lets Codex inspect the available work context and suggest automation candidates rather than forcing you to invent them.&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;Build your Codex workspace&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;Build Codex’s workspace before running any workflows. Skip this step and you’ll likely stall. &lt;/p&gt;&lt;p&gt;A Codex workspace is a folder—local on your machine, synced to GitHub if you want version control—that contains the context files, workflow instructions, and review standards Codex reads at the start of each session. 
Think of it as the agent’s onboarding manual.&lt;/p&gt;&lt;h4&gt;&lt;strong&gt;An example workspace structure&lt;/strong&gt;&lt;/h4&gt;&lt;p data-guide-block-kind="terminal" data-guide-block-id="guide-block-1779817905236-7c2762"&gt;your-workspace/&lt;/p&gt;&lt;p data-guide-block-id="guide-block-1779817905236-7c2762" data-guide-block-kind="terminal"&gt;├── README.md                  # Start here—orientation&lt;/p&gt;&lt;p data-guide-block-kind="terminal" data-guide-block-id="guide-block-1779817905236-7c2762"&gt;├── identity/                  # About you&lt;/p&gt;&lt;p data-guide-block-kind="terminal" data-guide-block-id="guide-block-1779817905236-7c2762"&gt;│   ├── context.md&lt;/p&gt;&lt;p data-guide-block-kind="terminal" data-guide-block-id="guide-block-1779817905236-7c2762"&gt;│   ├── preferences.md&lt;/p&gt;&lt;p data-guide-block-kind="terminal" data-guide-block-id="guide-block-1779817905236-7c2762"&gt;│   └── rules.md&lt;/p&gt;&lt;p data-guide-block-kind="terminal" data-guide-block-id="guide-block-1779817905236-7c2762"&gt;├── playbooks/                 # Process—repeatable workflows&lt;/p&gt;&lt;p data-guide-block-kind="terminal" data-guide-block-id="guide-block-1779817905236-7c2762"&gt;│   ├── workflows/&lt;/p&gt;&lt;p data-guide-block-kind="terminal" data-guide-block-id="guide-block-1779817905236-7c2762"&gt;│   ├── inbox-sweep.md&lt;/p&gt;&lt;p data-guide-block-kind="terminal" data-guide-block-id="guide-block-1779817905236-7c2762"&gt;│   └── research-brief.md&lt;/p&gt;&lt;p data-guide-block-kind="terminal" data-guide-block-id="guide-block-1779817905236-7c2762"&gt;├── sources/                   # Source shelf—inputs&lt;/p&gt;&lt;p data-guide-block-kind="terminal" data-guide-block-id="guide-block-1779817905236-7c2762"&gt;│   ├── sources/&lt;/p&gt;&lt;p data-guide-block-kind="terminal" data-guide-block-id="guide-block-1779817905236-7c2762"&gt;│   ├── key-links.md&lt;/p&gt;&lt;p data-guide-block-kind="terminal" 
data-guide-block-id="guide-block-1779817905236-7c2762"&gt;│   └── recurring-docs.md&lt;/p&gt;&lt;p data-guide-block-kind="terminal" data-guide-block-id="guide-block-1779817905236-7c2762"&gt;├── outputs/                   # Finished work&lt;/p&gt;&lt;p data-guide-block-kind="terminal" data-guide-block-id="guide-block-1779817905236-7c2762"&gt;│   ├── outputs/&lt;/p&gt;&lt;p data-guide-block-kind="terminal" data-guide-block-id="guide-block-1779817905236-7c2762"&gt;│   ├── drafts/&lt;/p&gt;&lt;p data-guide-block-kind="terminal" data-guide-block-id="guide-block-1779817905236-7c2762"&gt;│   └── reports/&lt;/p&gt;&lt;p data-guide-block-kind="terminal" data-guide-block-id="guide-block-1779817905236-7c2762"&gt;└── reviews/                   # Quality checks—guardrails&lt;/p&gt;&lt;p data-guide-block-kind="terminal" data-guide-block-id="guide-block-1779817905236-7c2762"&gt;    ├── data-checklist.md&lt;/p&gt;&lt;p data-guide-block-kind="terminal" data-guide-block-id="guide-block-1779817905236-7c2762"&gt;    └── writing-checklist.md&lt;/p&gt;&lt;p&gt;What you’re doing here has a name: context engineering—a term popularized by Shopify CEO &lt;strong&gt;&lt;u&gt;&lt;a href="https://x.com/tobi/status/1935533422589399127" rel="noopener noreferrer" target="_blank"&gt;Tobi Lütke&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; and prominent AI engineer &lt;strong&gt;&lt;u&gt;&lt;a href="https://x.com/karpathy/status/1937902205765607626?lang=en" rel="noopener noreferrer" target="_blank"&gt;Andrej Karpathy&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;. Getting the right context to the model at the right time accounts for at least half of its performance. &lt;/p&gt;&lt;p&gt;At the start of each session, Codex looks at &lt;code&gt;AGENTS.md&lt;/code&gt;, which works as the table of contents. 
You can write your standing instructions directly in it, but we recommend keeping &lt;code&gt;AGENTS.md&lt;/code&gt; short and pointing it at more detailed files: &lt;code&gt;context.md&lt;/code&gt; for who you are and what you’re working on, &lt;code&gt;preferences.md&lt;/code&gt; for how you want the work done, and &lt;code&gt;rules.md&lt;/code&gt; for what it may and may not do without asking.&lt;/p&gt;&lt;h4 data-guide-block-kind="callout" data-guide-block-id="guide-block-1779818686951-lpvg9i"&gt;&lt;strong&gt;What to put in your context files&lt;/strong&gt;&lt;/h4&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1779818686951-lpvg9i"&gt;&lt;code&gt;context.md&lt;/code&gt; should cover:&lt;/p&gt;&lt;ul&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779818686951-lpvg9i"&gt;Your role and the function you own&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779818686951-lpvg9i"&gt;Active projects and their current status&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779818686951-lpvg9i"&gt;The tools you use daily and what each one is for&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779818686951-lpvg9i"&gt;The people or teams you work with most closely&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779818686951-lpvg9i"&gt;How decisions typically get made in your context&lt;/li&gt;&lt;/ul&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1779818686951-lpvg9i"&gt;&lt;code&gt;preferences.md&lt;/code&gt; should cover:&lt;/p&gt;&lt;ul&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779818686951-lpvg9i"&gt;Writing style and tone (formal or conversational, terse or thorough)&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779818686951-lpvg9i"&gt;Communication preferences (what you like to 
review before it goes out and what can be drafted and queued without your involvement)&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779818686951-lpvg9i"&gt;Decision-making preferences (when to ask before acting and when to proceed and report back)&lt;/li&gt;&lt;/ul&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1779818686951-lpvg9i"&gt;&lt;code&gt;rules.md&lt;/code&gt; should cover:&lt;/p&gt;&lt;ul&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779818686951-lpvg9i"&gt;What Codex may never do without explicit approval: Send, post, archive, delete, modify a source of truth, or move money&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779818686951-lpvg9i"&gt;What Codex may do without asking: Draft, summarize, research, outline, organize&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779818686951-lpvg9i"&gt;Any standing constraints specific to your work (e.g., client confidentiality rules, brand standards, data handling requirements)&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;Starting prompt—use this to have Codex create your workspace structure:&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779818637346-s57oa8"&gt;[First: Create a folder on your desktop called “Codex”] &lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779818637346-s57oa8"&gt;Set up this folder as a simple Codex workspace for knowledge work.&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779818637346-s57oa8"&gt;Create three starter files:&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779818637346-s57oa8"&gt;    1. 
context.md—who I am, what I’m working on, what tools I use, and who I work with&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779818637346-s57oa8"&gt;    2. preferences.md—how I like work to be written, reviewed, and handled&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779818637346-s57oa8"&gt;    3. rules.md—what you may do without asking, what you must ask before doing, and what you must never do&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779818637346-s57oa8"&gt;Interview me one question at a time to gather the information you need to fill in each file.&lt;/p&gt;&lt;h4&gt;&lt;strong&gt;The “one pinned chat per project” rule &lt;/strong&gt;&lt;/h4&gt;&lt;p&gt;The workspace folder is for your context; pinned chats are for your work. You can find the option to pin a chat next to the chat name in the app’s left-hand navigation bar. A useful habit from day one is to keep one persistent, pinned thread per project or area of responsibility—one for the product launch, one for weekly reporting, one for recruiting—rather than spinning up a fresh chat for every request. A standing thread accumulates context as you go, so Codex remembers what you have already established and you don’t have to re-explain the project each time. A pinned chat with a stated goal turns the thread into a reliable home for that stream of work.&lt;/p&gt;&lt;h2&gt;&lt;strong&gt;Part 3: The five levels of Codex use&lt;/strong&gt;&lt;/h2&gt;&lt;p&gt;Codex power users don’t arrive there all at once. They get there in stages, and each stage calls for a different way of thinking about what Codex is doing and what it’s good for. Skip ahead too quickly, and you’ll get frustrated—either you don’t trust it yet, or you haven’t built the infrastructure for more autonomous work. 
At every level, you should know when to hand work to Codex and when to stay in the loop as its collaborator.&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;Level 1: One-off knowledge work&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Mental model:&lt;/strong&gt; Codex as a capable, thorough research and drafting assistant.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Mode: &lt;/strong&gt;Collaborate. At this level, nothing is automated. You run single-session tasks, review everything before it leaves your hands, and build familiarity with how Codex handles different types of work.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Best first tasks:&lt;/strong&gt;&lt;/p&gt;&lt;ul&gt;&lt;li&gt;Summarize a meeting transcript and extract decisions, open questions, and follow-up actions.&lt;/li&gt;&lt;li&gt;Turn scattered notes into a structured outline.&lt;/li&gt;&lt;li&gt;Build a research brief from a set of links and documents.&lt;/li&gt;&lt;li&gt;Rewrite a draft against a style guide.&lt;/li&gt;&lt;li&gt;Create a review checklist for a document, launch plan, or strategy memo.&lt;/li&gt;&lt;li&gt;Convert a written draft into an audio file for editing on the go.&lt;/li&gt;&lt;/ul&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779818837556-dazs6l"&gt;Use the attached [documents/links/notes] to produce [specific artifact]. Prioritize accuracy over elegance. Include source links for any factual claims. Flag anything uncertain or that requires my verification. End with the three questions I should answer before this artifact is ready to use.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Review habit:&lt;/strong&gt; Before polishing any output, ask Codex to list the assumptions it made and where it is least confident. 
This surfaces problems before you invest time in refinement.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Move to Level 2 when:&lt;/strong&gt; You keep wishing Codex remembered what you told it last time.&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;Level 2: Multi-source workflows&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Mental model:&lt;/strong&gt; Codex as a cross-system analyst that can assemble information you could never pull together manually in a reasonable amount of time.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Mode: &lt;/strong&gt;Collaborate. At this level, Codex can synthesize outputs from multiple connected systems—Slack threads, Notion pages, email archives, analytics dashboards, and Google Drive documents—but it still needs guidance and feedback.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Example multi-source tasks:&lt;/strong&gt;&lt;/p&gt;&lt;ul&gt;&lt;li&gt;A go-to-market plan built from internal meeting transcripts, Slack discussions, customer notes, and a strategy template&lt;/li&gt;&lt;li&gt;A weekly KPI report from analytics, revenue data, support volume, and social metrics&lt;/li&gt;&lt;li&gt;A summary synthesized from Slack, Notion, Drive links, and past drafts&lt;/li&gt;&lt;li&gt;A weekly leadership brief assembled from team standups, metrics, and open decisions&lt;/li&gt;&lt;/ul&gt;&lt;p data-guide-block-label="How to delegate a multi-source task" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779819023849-bi5kjp"&gt;I need [specific artifact].&lt;/p&gt;&lt;p data-guide-block-label="How to delegate a multi-source task" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779819023849-bi5kjp"&gt;Sources to use:&lt;/p&gt;&lt;p data-guide-block-label="How to delegate a multi-source task" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779819023849-bi5kjp"&gt; - [Tool 1]: [what to look for there]&lt;/p&gt;&lt;p data-guide-block-label="How to delegate a multi-source task" data-guide-block-kind="prompt" 
data-guide-block-id="guide-block-1779819023849-bi5kjp"&gt; - [Tool 2]: [what to look for there]&lt;/p&gt;&lt;p data-guide-block-label="How to delegate a multi-source task" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779819023849-bi5kjp"&gt; - [Tool 3]: [what to look for there]&lt;/p&gt;&lt;p data-guide-block-label="How to delegate a multi-source task" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779819023849-bi5kjp"&gt;Output format: [describe the structure you want]&lt;/p&gt;&lt;p data-guide-block-label="How to delegate a multi-source task" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779819023849-bi5kjp"&gt;Before you start, give me a short plan: Identify the sources you will inspect, the artifact you will produce, any gaps or unknowns you anticipate, and the checks you will run before calling it done. If anything requires sending, posting, archiving, or modifying a source of truth, ask first.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;A warning about data:&lt;/strong&gt; A one-shot attempt at pulling data from multiple systems can be wrong because of stale data, mismatched definitions, permissions gaps, or join errors. For any metric that informs business decisions or agent actions, verify column by column against your primary source. The closer a number is to a source of truth, the more carefully it needs to be checked.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Make your outputs agent-readable:&lt;/strong&gt; Plans and reports you generate in Codex will be read by other people—but also, increasingly, by their agents. Write them in plain, structured language that a human can scan and an agent can query. 
Clear section headers, explicit decisions, and labeled action items make the artifact useful in both directions.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Move to Level 3 when:&lt;/strong&gt; You keep running the same multi-source workflow more than once a week and wishing it happened automatically.&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;Level 3: Repeated chores into persistent workflows &lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Mental model:&lt;/strong&gt; Codex as an automated operations layer that handles predictable, recurring work so you don’t have to.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Mode: &lt;/strong&gt;Hybrid. Some tasks are fully predictable and can run without back-and-forth. These tasks are ripe for &lt;strong&gt;delegation&lt;/strong&gt;. Tasks that involve judgment, strategy, or creative decisions suit &lt;strong&gt;collaboration.&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;A useful heuristic: If you could write a checklist that covers 90 percent of the cases, delegate it. If you would need to think about it differently each time, collaborate.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;In either case, look for “computer chores”&lt;/strong&gt;—recurring tasks that take time and attention, but don’t require human judgment at every single touchpoint.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Common chore candidates:&lt;/strong&gt;&lt;/p&gt;&lt;ul&gt;&lt;li&gt;End-of-day check for unanswered Slack messages and emails, with drafted replies&lt;/li&gt;&lt;li&gt;Weekly metrics brief from analytics, revenue, and support data&lt;/li&gt;&lt;li&gt;Meeting-note cleanup and action-item extraction after each recorded call&lt;/li&gt;&lt;li&gt;Customer support pattern detection and issue routing&lt;/li&gt;&lt;li&gt;Draft-to-review package that formats a piece for editor handoff&lt;/li&gt;&lt;li&gt;Recruiting research for an open role&lt;/li&gt;&lt;/ul&gt;&lt;p data-guide-block-label="Workflow template" data-guide-block-kind="template" data-guide-block-id="guide-block-1779819137935-8no8us"&gt;Before building 
any persistent workflow, fill out this template. It becomes the instruction file Codex reads every time the workflow runs. (Each workflow in Part 4 is an example of this template applied.)&lt;/p&gt;&lt;p data-guide-block-label="Workflow template" data-guide-block-kind="template" data-guide-block-id="guide-block-1779819137935-8no8us"&gt;Workflow name:&lt;/p&gt;&lt;p data-guide-block-label="Workflow template" data-guide-block-kind="template" data-guide-block-id="guide-block-1779819137935-8no8us"&gt;Trigger or cadence:&lt;/p&gt;&lt;p data-guide-block-label="Workflow template" data-guide-block-kind="template" data-guide-block-id="guide-block-1779819137935-8no8us"&gt;Input sources:&lt;/p&gt;&lt;p data-guide-block-label="Workflow template" data-guide-block-kind="template" data-guide-block-id="guide-block-1779819137935-8no8us"&gt;Output artifact:&lt;/p&gt;&lt;p data-guide-block-label="Workflow template" data-guide-block-kind="template" data-guide-block-id="guide-block-1779819137935-8no8us"&gt;Approval rules:&lt;/p&gt;&lt;p data-guide-block-label="Workflow template" data-guide-block-kind="template" data-guide-block-id="guide-block-1779819137935-8no8us"&gt;What Codex may do without asking:&lt;/p&gt;&lt;p data-guide-block-label="Workflow template" data-guide-block-kind="template" data-guide-block-id="guide-block-1779819137935-8no8us"&gt;What Codex must ask before doing:&lt;/p&gt;&lt;p data-guide-block-label="Workflow template" data-guide-block-kind="template" data-guide-block-id="guide-block-1779819137935-8no8us"&gt;Verification steps:&lt;/p&gt;&lt;p data-guide-block-label="Workflow template" data-guide-block-kind="template" data-guide-block-id="guide-block-1779819137935-8no8us"&gt;Where the final output lives:&lt;/p&gt;&lt;p data-guide-block-label="Workflow template" data-guide-block-kind="template" data-guide-block-id="guide-block-1779819137935-8no8us"&gt;When to retire or revise this workflow:&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Review discipline for automated 
workflows:&lt;/strong&gt; Don’t review automated output inside Codex. Draft in Codex, then review in the destination app—Slack for Slack messages, Gmail for email drafts, word processors for documents. Content that looks fine in a terminal often reads differently in the tool where it’s ultimately used, and the context switch catches things a Codex review pass would miss.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Move to Level 4 when:&lt;/strong&gt; Your prompt-based workflow hits a ceiling—the task is too complex or too custom to handle in text alone, and a small script or local tool would make it reliable.&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;Level 4: Build small tools when prompts are not enough&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Mental model:&lt;/strong&gt; Codex as a builder that creates lightweight infrastructure to make your workflows more reliable, faster, or more repeatable.&lt;/p&gt;&lt;p&gt;Sometimes the best Codex output is a small script, a local app, a custom dashboard, or a review surface that makes a recurring workflow easier, rather than pure text.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Mode: &lt;/strong&gt;Hybrid. In some cases, Codex may generate an artifact independently for you to review and then move on. In others, the artifact it produces may become a space where you and the agent iterate together.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Examples of when a small tool helps:&lt;/strong&gt;&lt;/p&gt;&lt;ul&gt;&lt;li&gt;A recurring workflow that requires pulling from an API that has no Codex integration. A short script handles the connection reliably.&lt;/li&gt;&lt;li&gt;A review process where you need to see formatted output side by side with the source. A simple local app gives you the view.&lt;/li&gt;&lt;li&gt;A task that needs to run on a schedule without your involvement. A script set to run on a timer (a cron job) handles the timing.&lt;/li&gt;&lt;li&gt;A workflow that accumulates structured data over time. 
A lightweight database or structured file tracks it persistently.&lt;/li&gt;&lt;/ul&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820329598-tdqexj"&gt;&lt;strong&gt;Practical approach for non-engineers:&lt;/strong&gt;&lt;/p&gt;&lt;ol&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820329598-tdqexj"&gt;Run the task manually in Codex once to confirm the output is what you want&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820329598-tdqexj"&gt;Ask Codex: “Which steps in this workflow could be made more reliable with a small script or tool?”&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820329598-tdqexj"&gt;Have Codex prototype the tool and explain what it does in plain language&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820329598-tdqexj"&gt;Run it on your data and verify the output matches what the manual process produced&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820329598-tdqexj"&gt;Keep only the parts that reduce friction. Discard what adds complexity without benefit.&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;You don’t need to understand every line of code to use a tool Codex built. You do need to understand what data it touches, what it produces, and where the review step is. 
If you can’t explain those three things, the tool isn’t ready to run autonomously.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Move to Level 5 when: &lt;/strong&gt;You give Codex the same feedback repeatedly and have standing preferences that you’d prefer it to apply on its own.&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;Level 5: Compound your Codex system&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Mental model:&lt;/strong&gt; Codex as a system that can improve over time when you save useful workflows, maintain review rules, and use memories or skills to codify preferences where available.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Mode: &lt;/strong&gt;Hybrid. Some instructions will dictate how the agent approaches autonomous work; others will guide how the model interacts with you in collaboration mode.&lt;/p&gt;&lt;p&gt;The idea of “compounding” work comes from &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/guides/compound-engineering" rel="noopener noreferrer" target="_blank"&gt;compound engineering&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;, the AI-native coding methodology coined by &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@kieran_1355" rel="noopener noreferrer" target="_blank"&gt;Kieran Klaassen&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; and &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@nityesh" rel="noopener noreferrer" target="_blank"&gt;Nityesh Agarwal&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; while building &lt;strong&gt;&lt;u&gt;&lt;a href="http://cora.computer" rel="noopener noreferrer" target="_blank"&gt;Cora&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;, Every’s email client. The canonical example is a product requirements document (PRD) that writes the scaffolding for the next one: The artifact you produce becomes the tool that speeds up the next round. The four habits below are how you put it into practice as a knowledge worker, not just an engineer.&lt;/p&gt;&lt;p&gt;Remember: &lt;strong&gt;Each useful session should make future sessions faster and more reliable. 
&lt;/strong&gt;In practice, that requires doing four things consistently after completing any significant piece of work:&lt;/p&gt;&lt;p&gt;&lt;strong&gt;1. Save successful prompts as workflow files.&lt;/strong&gt; When a prompt produces exactly the right output, document it. Write down the input sources, the exact prompt, the output format, and the review step. Save it in your workflows/ folder. The next time you need the same output, the agent will have that reference to work from.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;2. Add mistakes to review checklists.&lt;/strong&gt; When Codex gets something wrong—a number that was off, a tone that missed the mark, or an assumption it should not have made—add a specific check to your relevant review file, and instruct Codex to check its work against those guardrails.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;3. Update your context files after major projects.&lt;/strong&gt; When a project ends, update context.md to reflect what changed—new priorities, new tools, what worked, and what didn’t. Codex can use this when you point it to the file, turn it into a skill/workflow, or store the pattern in Codex memory where available.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;4. Ask Codex to identify compounding opportunities.&lt;/strong&gt; At the end of any session where you did something useful, run this prompt:&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820367058-c271lt"&gt;Based on what we just did, what parts of this workflow should become a reusable skill, an automation, or a small tool? 
What context should I add to my project files so we don’t have to re-establish this next time?&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Forking for your discipline:&lt;/strong&gt; The &lt;u&gt;&lt;a href="https://github.com/EveryInc/compound-engineering-plugin" rel="noopener noreferrer" target="_blank"&gt;compound engineering plugin&lt;/a&gt;&lt;/u&gt;, Every’s open-source system for structured agent workflows, installable in Codex with one command, works for knowledge work out of the box, but its review agents are optimized for coding needs like establishing frontend patterns and reviewing for code performance.&lt;/p&gt;&lt;p&gt;Knowledge workers can fork it into a version with reviewers tuned for strategic alignment, data accuracy, writing quality, and communication standards. A forked version, &lt;u&gt;&lt;a href="https://github.com/EveryInc/compound-knowledge-plugin" rel="noopener noreferrer" target="_blank"&gt;compound knowledge&lt;/a&gt;&lt;/u&gt;, is publicly available on Every’s GitHub, and is designed to be readable and editable by non-engineers.&lt;/p&gt;&lt;h2&gt;&lt;strong&gt;Part 4: Workflow library&lt;/strong&gt;&lt;/h2&gt;&lt;p&gt;These workflows are meant as inspiration to get you started. Adapt the inputs, outputs, and approval rules to your specific tools and standards.&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;1. 
Inbox zero review queue&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; Anyone whose email backlog is a recurring source of anxiety or dropped balls.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Input sources:&lt;/strong&gt; Gmail or your email client of choice.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Output artifact:&lt;/strong&gt; A structured list of draft replies, proposed actions (archive, delegate, flag), and any emails flagged for your personal attention because the draft alone isn’t sufficient.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@danshipper" rel="noopener noreferrer" target="_blank"&gt;Dan Shipper&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; kept inbox zero for 10 days straight with Codex. To use this workflow, have Codex:&lt;/p&gt;&lt;ol&gt;&lt;li&gt;Gather email through &lt;u&gt;&lt;a href="https://cora.computer/" rel="noopener noreferrer" target="_blank"&gt;Cora&lt;/a&gt;&lt;/u&gt; running in the in-app browser.&lt;/li&gt;&lt;li&gt;Render the email queue as a single page.&lt;/li&gt;&lt;li&gt;Go through each item with you as you dictate the action the AI should take (e.g., “research this,” “draft that,” “pull the documents our lawyers asked for”). You can do this via chat or by voice with a dictation tool like &lt;strong&gt;&lt;u&gt;&lt;a href="https://www.monologue.to/" rel="noopener noreferrer" target="_blank"&gt;Monologue&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; (we recommend voice). 
&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;&lt;strong&gt;First prompt:&lt;/strong&gt;&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820476286-abtpfb"&gt;	Go through my inbox for the past [time period].&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820476286-abtpfb"&gt;	For each email that needs a response or action:&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820476286-abtpfb"&gt;1. Categorize it: needs reply/needs action/can archive/already handled&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820476286-abtpfb"&gt;2. If it needs a reply, draft one in my voice using the style in &lt;code&gt;preferences.md&lt;/code&gt;&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820476286-abtpfb"&gt;3. If it needs action, describe the action clearly&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820476286-abtpfb"&gt;4. Flag any email where a draft reply isn’t enough—where I need to think about this personally before responding&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820476286-abtpfb"&gt;	Don’t send anything. Create drafts only. I will review in Gmail.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Review step:&lt;/strong&gt; Review all drafts in Gmail before sending. Don’t approve from inside Codex.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;How to compound:&lt;/strong&gt; After a few sessions, add a rule file describing your categorization preferences—which senders always get priority, which topics can be archived without reply, and which types of requests need a human-written response. &lt;/p&gt;&lt;h3&gt;&lt;strong&gt;2. 
Daily unanswered message roundup&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; Anyone who communicates across Slack, email, and other channels and loses track of what still needs a response.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Input sources:&lt;/strong&gt; Slack, Gmail, any other communication tool you use.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Output artifact:&lt;/strong&gt; A list of unanswered items with drafted replies or proposed reactions, organized by urgency.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;First prompt:&lt;/strong&gt;&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820496080-1lm39w"&gt;Look across my Slack and Gmail for the past 24 hours. Find everything that was directed at me that I have not responded to.&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820496080-1lm39w"&gt;	For each item:&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820496080-1lm39w"&gt;1. Draft a reply or suggest a reaction (thumbs up, etc.) if a short acknowledgment is appropriate&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820496080-1lm39w"&gt;2. Flag items where a more considered response is needed&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820496080-1lm39w"&gt;3. Flag anything time-sensitive&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820496080-1lm39w"&gt;	Present the list organized by urgency. 
Don’t send anything.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Review step:&lt;/strong&gt; Review in Slack and Gmail.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;How to compound:&lt;/strong&gt; After a few runs, save a rules file specifying which Slack channels are high-priority, which senders always warrant a human response, and which types of messages can be handled with a reaction rather than a reply.&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;3. Research brief creation&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; Anyone preparing for a meeting, a pitch, a content piece, or a strategic decision and needing a thorough, sourced summary of a topic.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Input sources:&lt;/strong&gt; Provided links, Notion, Drive, web search.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Output artifact:&lt;/strong&gt; A structured brief with background, key facts, open questions, and source links.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;First prompt:&lt;/strong&gt;&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820511227-dhvhjc"&gt;	Build a research brief on [topic]. 
&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820511227-dhvhjc"&gt;	Sources to prioritize: [List any specific links, documents, or databases].&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820511227-dhvhjc"&gt;	Structure the brief as:&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820511227-dhvhjc"&gt;	- Background: what I need to know to have a smart conversation about this&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820511227-dhvhjc"&gt;	- Key facts and data points, each with a source link&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820511227-dhvhjc"&gt;	- Competing perspectives or significant disagreements in the field&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820511227-dhvhjc"&gt;	- Open questions I should be able to answer before [meeting/decision/deadline]&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820511227-dhvhjc"&gt;	- Three things I should read next if I want to go deeper&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820511227-dhvhjc"&gt;Flag any claims you are less than confident about.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Review step:&lt;/strong&gt; Check source links. Verify any statistics against the original source before using them.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;How to compound:&lt;/strong&gt; Save a brief template in your workflows/ folder. 
After each brief, add any recurring sources (newsletters, databases, key authors) to your sources/key-links.md so Codex checks them by default.&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;4. Writing with a parallel review loop&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; Writers who want Codex running alongside them as they draft—checking the work, flagging issues, and responding in parallel without interrupting the writing session.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Input sources:&lt;/strong&gt; Your draft (open in your word processor through Codex’s in-app browser), any relevant style guides, source documents, or review standards in your workspace.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Output artifact:&lt;/strong&gt; An annotated draft with inline feedback, flagged issues, and suggested revisions—produced continuously as you write rather than in a single pass at the end.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Setup: &lt;/strong&gt;Open your draft in Proof or the in-app browser. Start a Codex session with your workspace context loaded. Give Codex standing instructions for what to monitor and how to respond.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;First prompt:&lt;/strong&gt;&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820529028-05kfk1"&gt;	I am writing [describe the piece—type, audience, purpose].&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820529028-05kfk1"&gt;As I draft, run a continuous review loop. 
Check for:&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820529028-05kfk1"&gt;- Claims that need a source or are stated with more confidence than the evidence supports&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820529028-05kfk1"&gt;- Passages where the argument loses clarity or the logic has a gap&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820529028-05kfk1"&gt;- Sentences that violate the style preferences in &lt;u&gt;&lt;a href="http://preferences.md" rel="noopener noreferrer" target="_blank"&gt;preferences.md&lt;/a&gt;&lt;/u&gt;&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820529028-05kfk1"&gt;- Anything that reads as filler, throat-clearing, or AI-generated phrasing&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820529028-05kfk1"&gt;Don’t rewrite anything without being asked. Flag issues as I go with a brief note on what the problem is and what would fix it. Check in every [X minutes / X paragraphs] or when I ask.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Review step:&lt;/strong&gt; Read the flagged issues at natural stopping points—the end of a section or session. Decide which to address and which to dismiss. Don’t let the feedback loop interrupt the drafting flow; the value is in the accumulation, not in responding to every flag in real time.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;How to compound:&lt;/strong&gt; After each writing session, add any recurring flags to your reviews/writing-checklist.md. Patterns that come up repeatedly are candidates for a standing rule in your preferences file, so Codex catches them automatically next time.&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;5. 
Source management for research&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; Writers and researchers who need to organize source material before drafting.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Input sources:&lt;/strong&gt; Links, PDFs, past drafts, notes, transcripts.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Output artifact:&lt;/strong&gt; A structured document with the core argument, supporting evidence organized by claim, counterarguments, and a gap analysis (what is still missing).&lt;/p&gt;&lt;p&gt;&lt;strong&gt;First prompt:&lt;/strong&gt;&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820559583-q3ha4f"&gt;	I am writing a piece on [topic]. The core argument I want to make is [argument].&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820559583-q3ha4f"&gt;	Here are my source materials: [links/documents].&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820559583-q3ha4f"&gt;	Build an evidence room that:&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820559583-q3ha4f"&gt;1. States the core argument clearly&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820559583-q3ha4f"&gt;	2. Lists the strongest supporting evidence for each main point, with source links&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820559583-q3ha4f"&gt;	3. Lists the strongest counterarguments and how I might address them&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820559583-q3ha4f"&gt;	4. 
Identifies any gaps—claims I am making that lack strong evidence&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820559583-q3ha4f"&gt;	5. Flags any sources that conflict with each other&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Review step:&lt;/strong&gt; Read the evidence room before drafting. Verify any statistics or quotes you plan to use directly.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;How to compound:&lt;/strong&gt; Save the evidence format as a workflow template. Add a standing note to your context file about your writing voice and recurring themes so Codex calibrates its framing.&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;6. Information via audio&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; Anyone who processes information better by listening than reading, or who wants to take time away from a screen but stay on top of work.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Input sources:&lt;/strong&gt; Any written content: drafts, research briefs, meeting summaries, strategy documents, reports, lengthy emails, articles.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Output artifact:&lt;/strong&gt; An audio file saved to a location accessible from your phone (Dropbox, Drive, etc.).&lt;/p&gt;&lt;p&gt;&lt;strong&gt;First prompt:&lt;/strong&gt;&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820573319-wgfep4"&gt;Convert the attached [document/draft/report] into a clear audio file. Read it at a natural pace—not rushed, not slow. Save it to [Dropbox/Drive location] as [filename].&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820573319-wgfep4"&gt;&lt;strong&gt;Review step:&lt;/strong&gt; Listen on your commute, walk, or wherever you have time away from a screen. Take notes on your phone as things come up. 
Return to the source material with whatever you noticed.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;How to compound:&lt;/strong&gt; Add a standing instruction to your context file covering your audio preferences—such as speed, file format, naming convention, and preferred save location—so you do not have to specify each time. You can also prompt Codex to convert content automatically at the end of certain workflows: “After generating the weekly metrics report, convert it to audio and save to [location].”&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;7. Go-to-market plan generator&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; Anyone responsible for launching a product, feature, or initiative and who has done the thinking in meetings and Slack but has not had time to formalize it.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Input sources:&lt;/strong&gt; Meeting transcripts, Slack threads, customer notes, a preferred strategy template.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Output artifact:&lt;/strong&gt; A complete go-to-market plan, structured for human review and agent querying.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;First prompt:&lt;/strong&gt;&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820595232-4q50ag"&gt;	Build a go-to-market plan for [product/initiative]. 
&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820595232-4q50ag"&gt;	Sources to pull from:&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820595232-4q50ag"&gt;	- Meeting transcripts: [Notion location or links]&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820595232-4q50ag"&gt;	- Slack discussions: [channels or search terms]&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820595232-4q50ag"&gt;	- Customer research: [document or location]&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820595232-4q50ag"&gt;	- Template to follow: [link or paste template]&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820595232-4q50ag"&gt;The plan should be readable by a human in five minutes and structured so that an agent can answer specific questions about it (e.g., “What is the target ICP?” “What is the launch timeline?”).&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820595232-4q50ag"&gt;Start with a compound engineering brainstorm step. Give me a draft in Proof or Notion. Flag anything in the plan you added that was not in the source material—I only want synthesis of what we have already decided, not new suggestions baked in.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Review step:&lt;/strong&gt; Review in Notion or Proof. Verify that every major claim traces to something in the source material. Anything the model added that was not in your sources should be flagged for your decision.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;How to compound:&lt;/strong&gt; Save the template and the prompt. 
After each launch, add a retrospective note to your context file about what the plan got right and wrong. Future plans will be calibrated by past ones.&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;8. KPI report&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; Anyone responsible for tracking metrics and needing a regular, reliable view across multiple data sources.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Input sources:&lt;/strong&gt; Analytics (PostHog, Mixpanel, Amplitude), revenue data (Stripe), support volume, social metrics, saved past reports.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Output artifact:&lt;/strong&gt; A one-page report covering headlines, usage metrics, system health, and follow-up items.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;First prompt:&lt;/strong&gt;&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820607666-pyt5wv"&gt;	Generate a product pulse report for [time period].&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820607666-pyt5wv"&gt;	Data sources:&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820607666-pyt5wv"&gt;	- Product analytics: [tool and what to pull]&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820607666-pyt5wv"&gt;	- Revenue: [tool and what to pull]&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820607666-pyt5wv"&gt;	- Support: [tool and what to pull]&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820607666-pyt5wv"&gt;	- Social: [tool and what to pull]&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820607666-pyt5wv"&gt;	Structure:&lt;/p&gt;&lt;p 
data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820607666-pyt5wv"&gt;	1. Headlines (three to five bullets summarizing what matters most)&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820607666-pyt5wv"&gt;2. Usage (primary engagement metric, value-realization metric, conversions, deltas vs. prior period)&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820607666-pyt5wv"&gt;3. System health (error rates, latency, top error signatures)&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820607666-pyt5wv"&gt;	4. Follow-ups (one to five things worth investigating, specific enough to act on)&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820607666-pyt5wv"&gt;Flag any number that differs significantly from the prior report. If something is anomalous, investigate one level deeper before including it.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Review step:&lt;/strong&gt; Verify every number in the report against its source. Don’t use this report as a business source of truth until you have confirmed accuracy column by column. In practice, one-shot metrics pulls are often five to 10 percent off—a common result of definition mismatches and join errors across multi-source pulls.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;How to compound:&lt;/strong&gt; Save each report as a dated file in your outputs/reports/ folder. Over time, Codex can compare reports, identify trends, and flag when something has changed. The folder becomes the working memory of your product.&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;9. 
Customer support for product work&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; Teams where support patterns should feed into product decisions and small fixes.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Input sources:&lt;/strong&gt; Support platform (Intercom, Zendesk), issue tracker (Linear, GitHub Issues).&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Output artifact:&lt;/strong&gt; A deduplicated list of issues with suggested priority, plus small issues ready to hand off for fixes.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;First prompt:&lt;/strong&gt;&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820617261-k37uky"&gt;	Go through my support queue for the past [time period]. &lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820617261-k37uky"&gt;	For each support thread:&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820617261-k37uky"&gt;	1. Identify the underlying issue or request.&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820617261-k37uky"&gt;	2. Check whether a similar issue already exists in [Linear/GitHub Issues].&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820617261-k37uky"&gt;	3. If it does, link them. If it doesn’t, draft a new issue.&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820617261-k37uky"&gt;	4. Flag any issue that appears more than [threshold] times—these are priorities.&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820617261-k37uky"&gt;	5. 
For issues that appear straightforward to fix, note that they are candidates &lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820617261-k37uky"&gt;	   for direct implementation.&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820617261-k37uky"&gt;	Don’t create issues in the tracker yet. Give me the list to review first.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Review step:&lt;/strong&gt; Review the issue list before anything goes into the tracker. Confirm deduplication is accurate—support tickets often describe the same underlying problem in different words.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;How to compound:&lt;/strong&gt; After each session, add a note about recurring issue types so Codex can categorize faster next time. Build a persistent list of known issues so deduplication improves over time.&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;10. Pull requests for non-engineers&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; Anyone who needs to make a small, well-scoped change to a codebase—such as copy updates, configuration changes, or content edits—without deep engineering knowledge.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Input sources:&lt;/strong&gt; The relevant files or repository, and a clear description of the change.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Output artifact:&lt;/strong&gt; A pull request (PR) that is reviewer-friendly and doesn’t touch anything outside the intended scope.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;First prompt:&lt;/strong&gt;&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820628006-6q08r6"&gt;	I need to make the following change: [describe the change clearly].&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820628006-6q08r6"&gt;	Before making any changes:&lt;/p&gt;&lt;p 
data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820628006-6q08r6"&gt;1. Show me which files are affected&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820628006-6q08r6"&gt;2. Confirm the scope of the change—nothing outside these files should be touched&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820628006-6q08r6"&gt;	3. Explain what you are going to do in plain language before doing it&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820628006-6q08r6"&gt;	After making the change:&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820628006-6q08r6"&gt;	1. Summarize what was changed and why&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820628006-6q08r6"&gt;	2. List every file that was touched&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820628006-6q08r6"&gt;	3. Explain how you verified the change is correct&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820628006-6q08r6"&gt;	4. Flag anything a reviewer should look at carefully&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820628006-6q08r6"&gt;	Make the smallest useful change. Don’t refactor or improve anything adjacent.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Review step:&lt;/strong&gt; Review the Codex preview before the PR is opened. Review the PR itself in GitHub or your code review tool. 
Ask a technical colleague to approve before merging if you are uncertain.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;How to compound:&lt;/strong&gt; Save a template of your preferred PR format. After each PR, add a note about anything that requires correction so future PRs avoid the same issue.&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;11. Recruiting research&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; Anyone doing outbound recruiting for a role with a specific background profile.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Input sources:&lt;/strong&gt; LinkedIn, Twitter/X, company websites, alumni databases, public professional networks.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Output artifact:&lt;/strong&gt; A list of candidates with background summaries and contact information or connection points.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;First prompt:&lt;/strong&gt;&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820638904-lqg7gg"&gt;	I am hiring for [role]. The ideal candidate has [background profile—experience, &lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820638904-lqg7gg"&gt;	prior companies, skills, career trajectory].&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820638904-lqg7gg"&gt;	Search for candidates who match this profile. For each candidate:&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820638904-lqg7gg"&gt;	1. Summarize their background in two to three sentences&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820638904-lqg7gg"&gt;	2. Note why they match the profile&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820638904-lqg7gg"&gt;	3. 
Identify any connection point (mutual connections, follows, shared affiliations)&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820638904-lqg7gg"&gt;	4. Provide a link to their public profile&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820638904-lqg7gg"&gt;	Return the top [number] candidates, ranked by how closely they match the profile.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Review step:&lt;/strong&gt; Review each candidate before any outreach. Verify that the background summaries are accurate by checking the linked profiles. Don’t send any outreach through Codex.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;How to compound:&lt;/strong&gt; Save the role profile as a template. After a successful hire, document what the actual background looked like versus the initial profile to calibrate future searches.&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;12. Strategy and planning agent&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; Leaders and operators who need to compress OKR planning, quarterly planning, or strategic reviews from days to hours.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Input sources:&lt;/strong&gt; Past planning documents, meeting transcripts, leadership context notes, relevant metrics.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Output artifact:&lt;/strong&gt; A draft plan or OKR set, structured for review and iteration.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;First prompt:&lt;/strong&gt;&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820656796-4vc62h"&gt;	I need to draft [quarterly plan / OKR set / strategic review] for [scope].&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820656796-4vc62h"&gt;	Pull from:&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" 
data-guide-block-id="guide-block-1779820656796-4vc62h"&gt;	- Past plans: [location]&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820656796-4vc62h"&gt;	- Recent meeting transcripts: [location]&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820656796-4vc62h"&gt;	- Current metrics: [location or description]&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820656796-4vc62h"&gt;	- Leadership context: [document or description]&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820656796-4vc62h"&gt;	Structure the output as [desired format].&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820656796-4vc62h"&gt;Flag any goal or initiative you are recommending that doesn’t have explicit support in the source material. I want synthesis of what has already been decided, not new recommendations baked in without my review.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Review step:&lt;/strong&gt; Review in Notion or Proof. Before sharing with leadership or the team, confirm that every major commitment traces to a decision that was actually made.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;How to compound:&lt;/strong&gt; After each planning cycle, add a retrospective to your context file. Did the goals prove achievable? What was missing from the original plan? Future planning sessions will be informed by past ones.&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;13. 
Personal learning tool&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; Anyone who wants to use Codex to support skill-building, practice, or self-directed learning.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Input sources:&lt;/strong&gt; External APIs, files, structured practice materials, your own notes.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Output artifact:&lt;/strong&gt; A custom interactive tool—like a tutor, a quiz, or a practice environment—built for your learning goal.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Example: &lt;/strong&gt;A musician wants to practice chord identification. They connect a MIDI keyboard and describe what they want, and Codex builds a small app that listens to what they play, identifies the chord, and tracks progress over time. &lt;/p&gt;&lt;p&gt;&lt;strong&gt;First prompt:&lt;/strong&gt;&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820674573-yry42c"&gt;	I want to build a personal learning tool for [skill or subject]. &lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820674573-yry42c"&gt; 	My current level: [beginner/intermediate/what I know already].&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820674573-yry42c"&gt;	What I want to practice: [specific aspect of the skill].&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820674573-yry42c"&gt;	How I want feedback: [immediate/after each session/scored].&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820674573-yry42c"&gt;	Build a prototype I can use locally. Explain what it does and how to use it before I start.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Review step:&lt;/strong&gt; Try the tool on real practice material before committing to it. 
Verify it is actually testing what you intended.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;How to compound:&lt;/strong&gt; After each practice session, ask Codex to update the tool based on what you found most and least useful. The tool improves as your needs become clearer.&lt;/p&gt;&lt;h2&gt;&lt;strong&gt;Part 5: Operating Codex well&lt;/strong&gt;&lt;/h2&gt;&lt;h3&gt;&lt;strong&gt;How to steer Codex&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;Operating Codex well is management work. You evaluate talent (which prompts, agents, and workflows to trust), set vision (what to point Codex at, and what “done” should look like), exercise taste (catching output that is technically correct but wrong for the moment), and know when to let it run and when to take the wheel.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Give Codex an outcome.&lt;/strong&gt; Describe what you want to end up with, not how to get there. “Build a research brief on [topic] with these sources and this structure” produces better results than “First search Slack, then search Notion, then...”&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Ask for a plan before long-running work.&lt;/strong&gt; For any task that will take more than a few minutes or touch multiple systems, ask Codex to explain what it’s about to do before it starts. This catches misunderstandings early and gives you a chance to redirect it before it gets too far along.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Ask Codex what it needs before it starts.&lt;/strong&gt; For complex tasks, a short briefing prompt saves time:&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820795697-av5ui3"&gt;Before you start, tell me what additional context would help you do this better. What are the most important things you would want to know?&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Require citations and audit trails for important claims.&lt;/strong&gt; Any document that will be shared or used for decisions should have source links for factual claims. 
Make this a standing rule in your preferences file.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Don’t over-manage every micro-step once the plan is good.&lt;/strong&gt; Once you have confirmed the approach, let Codex work. Interrupting undermines autonomous operation and produces worse results than reviewing the completed output.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Review in the destination app.&lt;/strong&gt; Always.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Set explicit no-send/no-post/no-archive/no-modify rules in your rules file.&lt;/strong&gt; These should apply by default to any sensitive workflow. Make Codex ask before taking any action that can’t easily be undone.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Three questions to ask before approving any significant output:&lt;/strong&gt;&lt;/p&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820810918-pg97kr"&gt;What was the hardest decision you made in producing this?&lt;/p&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820810918-pg97kr"&gt;What alternatives did you consider and reject?&lt;/p&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820810918-pg97kr"&gt;Where are you least confident?&lt;/p&gt;&lt;p&gt;These questions surface the judgment calls the model made, the options it dismissed, and the places most likely to contain errors.&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;Safety, trust, and risks&lt;/strong&gt;&lt;/h3&gt;&lt;h4&gt;&lt;strong&gt;Risk categories&lt;/strong&gt;&lt;/h4&gt;&lt;p&gt;&lt;strong&gt;Green—proceed with standard review:&lt;/strong&gt; Summaries, outlines, internal drafts, research briefs, personal notes, low-stakes scripts.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Yellow—review carefully before sharing or acting:&lt;/strong&gt; Strategy documents, customer-support drafts, product specs, recruiting research, non-destructive data pulls, PR drafts for small changes.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Red—don’t proceed without explicit human 
verification:&lt;/strong&gt; Sending messages to clients or customers, changing source-of-truth data, making production code changes, moving money, legal or compliance claims, unreconciled metrics used for business decisions.&lt;/p&gt;&lt;h4 data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820870615-20fyx4"&gt;&lt;strong&gt;Common failure modes and how to handle them&lt;/strong&gt;&lt;/h4&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820870615-20fyx4"&gt;&lt;em&gt;Confident wrongness.&lt;/em&gt; Codex can state incorrect facts with high confidence. For any factual claim that matters, verify against the source. Never pass a statistic or claim to another person without checking it.&lt;/p&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820870615-20fyx4"&gt;&lt;em&gt;Metrics errors.&lt;/em&gt; Joining data from multiple sources introduces definition mismatches and calculation errors. Verify column by column for any metric used in decisions.&lt;/p&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820870615-20fyx4"&gt;&lt;em&gt;Out-of-scope changes.&lt;/em&gt; Codex sometimes modifies files or makes improvements adjacent to the task you assigned. Review the changes line by line (called a “diff”), not just the final output, especially for any task involving code.&lt;/p&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820870615-20fyx4"&gt;&lt;em&gt;Automations that break.&lt;/em&gt; Persistent workflows stop working when tools update their APIs, credentials expire, or context files become stale. Every automation needs an owner who checks it periodically. Sever that connection—stop tending it—and the agent stops being useful. 
“Set it and forget it” isn’t a stable operating mode.&lt;/p&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820870615-20fyx4"&gt;&lt;em&gt;Plugin and integration failure.&lt;/em&gt; Plugins and integrations need maintenance: Permissions expire, APIs change, configurations need updates, and some changes require restarting Codex. Integration failures—particularly with Notion and Gmail—happen and aren’t always obvious. If a workflow produces strange output, check whether the connection is still working before assuming the prompt is wrong.&lt;/p&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820870615-20fyx4"&gt;&lt;em&gt;Usage limits.&lt;/em&gt; Long-running sessions can hit usage limits and stop mid-task. For complex workflows, break work into stages rather than attempting everything in a single session.&lt;/p&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820870615-20fyx4"&gt;&lt;em&gt;Untrusted input.&lt;/em&gt; Anything Codex reads—an email, a web page, a shared document, a support ticket—can contain instructions aimed at the agent rather than at you, sometimes hidden from human eyes. If Codex is browsing untrusted sites or processing external messages while holding broad write access, those buried instructions can turn into actions—like sending data where it shouldn’t go. So keep destructive actions behind approval, and scope each workflow to the least access it needs, so a hijacked instruction has nowhere to go.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;The human ownership standard:&lt;/strong&gt; Codex can touch any artifact in your workflow, but a human must direct the work, stand behind the output, and be able to discuss any specific decision in it. If someone asks you about a bullet point in a document Codex drafted, you should be able to answer. 
An AI-drafted document is fine—expected, even—but if someone talks it through with you and it’s clear you have no idea what’s in it, that’s a problem.&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;Team workflows: From personal Codex to shared operating system&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;Individual Codex workflows compound over time. Team workflows compound faster but require coordination.&lt;/p&gt;&lt;h4&gt;&lt;strong&gt;What changes when a team uses Codex&lt;/strong&gt;&lt;/h4&gt;&lt;p&gt;Teams build trust in agents through the humans who operate them. When a colleague receives a document or plan that Codex drafted, they trust it to the degree they trust the person who shared it.&lt;/p&gt;&lt;h4&gt;&lt;strong&gt;Infrastructure that makes team Codex work&lt;/strong&gt;&lt;/h4&gt;&lt;p&gt;&lt;em&gt;Shared review surfaces.&lt;/em&gt; A shared document review tool (Proof, Notion, Google Docs) makes agent-generated documents easier to inspect and comment on than outputs reviewed only inside Codex.&lt;/p&gt;&lt;p&gt;&lt;em&gt;Codex-mediated routing.&lt;/em&gt; Teams can combine Codex threads, automations, Slack or GitHub integrations, remote connections, and app-server APIs to build routing workflows: Requests arrive in Slack, email, or another shared intake surface; Codex helps triage them, creates reviewable tasks or drafts, and routes the work to the right human or Codex workspace for execution. Each route needs clear ownership, permissions, review rules, and a source of truth. For teams handling many cross-functional requests, such as legal reviews, data pulls, or copy approvals, this pattern removes significant coordination overhead.&lt;/p&gt;&lt;p&gt;A key to making this style of work possible is giving Codex its own email address. Codex doesn’t come with one—you set it up with a tool like &lt;u&gt;&lt;a href="https://www.nylas.com/" rel="noopener noreferrer" target="_blank"&gt;Nylas&lt;/a&gt;&lt;/u&gt; that gives an agent an inbox. 
Once it has that address, you can treat it like another teammate. Routes built on an email address still need the same discipline as any other: a clear owner, scoped permissions, and a review step before anything goes back out.&lt;/p&gt;&lt;p&gt;&lt;em&gt;Agent-readable shared documentation.&lt;/em&gt; Plans, strategy documents, and operational guides written for both human and agent readers become shared infrastructure. Any team member—or any team member’s agent—can query them for specific information without interrupting the author.&lt;/p&gt;&lt;p&gt;&lt;em&gt;Explicit ownership.&lt;/em&gt; Every persistent workflow needs a named owner. That person is responsible for monitoring output quality, updating the workflow when it breaks, and retiring it when it’s no longer useful. Automation degrades without ownership.&lt;/p&gt;&lt;h4 data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820923832-39cry6"&gt;A simple way to get a team to use Codex&lt;/h4&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820923832-39cry6"&gt;Don’t try to convert everyone. As a rule of thumb, a tenth of any team will adopt a new tool no matter what, a tenth never will, and the other 80 percent come along once someone shows them how it helps their own job. Aim at that 80 percent. 
Three things, done together, help adoption along:&lt;/p&gt;&lt;ol&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820923832-39cry6"&gt;A note from a leader that makes using AI the expectation, not a nice-to-have&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820923832-39cry6"&gt;A weekly meeting where anyone can show a prompt or workflow they’ve built&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820923832-39cry6"&gt;A regular message that names the people whose work stood out&lt;/li&gt;&lt;/ol&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820923832-39cry6"&gt;Set the expectation, give people a place to share what works, and recognize them for it—that’s most of the battle.&lt;/p&gt;&lt;h2&gt;&lt;strong&gt;Part 6: Getting started&lt;/strong&gt;&lt;/h2&gt;&lt;h3&gt;&lt;strong&gt;The seven-day Codex power-user plan&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Day 1: Connect and inspect.&lt;/strong&gt; Install the Codex desktop app. Connect your primary tools—Gmail, Slack, Notion, Drive, and any analytics or support tools you use. Run the workflow discovery prompt from Part 2 and review the three automation suggestions Codex returns. Don’t build anything yet. Just read the suggestions and identify which one is most useful.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Day 2: Create your context files.&lt;/strong&gt; Create your codex-workspace/ folder. Write context.md, preferences.md, and rules.md. Keep each one to one page. The goal is to capture the most important things Codex should know about you—not to be exhaustive.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Day 3: Run three one-off tasks.&lt;/strong&gt; Choose one summary task, one research brief, and one draft or plan. Use the prompt patterns from Level 1. 
Review each output carefully and note where Codex got things right and where it needed correction.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Day 4: Build your first workflow.&lt;/strong&gt; Take the most useful automation suggestion from Day 1 and fill out the workflow canvas from Level 3. Save it to workflows/ in your workspace. Run it once manually and verify the output.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Day 5: Add review rules.&lt;/strong&gt; Create reviews/data-checklist.md, reviews/writing-checklist.md, and reviews/comms-checklist.md. Start each one with five checks based on what you noticed during Days 3 and 4. These will grow over time.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Day 6: Turn one workflow into a reusable artifact.&lt;/strong&gt; Take the workflow from Day 4 and document the prompt, the output format, the review step, and any known edge cases. Save it as a complete workflow file. Run it again and verify the documentation is accurate.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Day 7: Compound.&lt;/strong&gt; Run the compounding prompt at the end of your Codex session:&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820956716-yzcjc0"&gt;Based on everything we have done this week, what should become a reusable skill,&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820956716-yzcjc0"&gt;an automation, or a small tool? 
What context should I add to my project files&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820956716-yzcjc0"&gt;so future sessions start from a better baseline?&lt;/p&gt;&lt;p&gt;Review Codex’s suggestions and implement the one that would save the most time over the next month.&lt;/p&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820994207-1bypew"&gt;&lt;strong&gt;30-day extension:&lt;/strong&gt;&lt;/p&gt;&lt;ul&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820994207-1bypew"&gt;Week 1: One personal workflow running reliably&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820994207-1bypew"&gt;Week 2: One multi-source workflow pulling from at least three connected tools&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820994207-1bypew"&gt;Week 3: One small tool or automation that handles a chore without your involvement&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820994207-1bypew"&gt;Week 4: One shared or team workflow with explicit ownership and review cadence&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;Start today. Connect the tools you’re comfortable permissioning and ask Codex what recurring workflows it can see from the available context. That question, and what you do with the answer, is the gateway to the Codex universe.&lt;/p&gt;&lt;p&gt;Codex is easy to underestimate. At first glance it looks like another AI coding tool; if you’re not an engineer, a natural conclusion is that it’s not for you.&lt;/p&gt;&lt;p&gt;That reading misses how much Codex makes possible. &lt;/p&gt;&lt;p&gt;Picture a Monday morning: A request for a launch plan lands in your inbox. You forward it to Codex, which has its own email account, and close your laptop while Codex runs tasks in the cloud, or on a machine like a Mac Mini that you keep active. 
On your commute to the office, you get an email notification on your phone: Codex has read the relevant Slack threads, pulled customer notes out of Google Drive, checked last quarter’s numbers in PostHog, and started a go-to-market plan in a shared Notion document. It just needs you to confirm one detail about timing, which you do with a thumbs-up. By the time you reach your desk, a draft is waiting for review. &lt;/p&gt;&lt;p&gt;This is a day in the life of an agent-pilled knowledge worker. It all runs on OpenAI’s agent, Codex, in the Codex desktop app. We use “Codex” to refer to the app throughout this guide. &lt;/p&gt;&lt;p&gt;Codex is a workspace for you and your AI agents. Give Codex access to the files, apps, and tools it needs, and it gathers context and moves through the task across every surface it can reach—including your connected apps, the browser, and your computer. That makes it useful not just for code, but for a broad range of knowledge work.&lt;/p&gt;&lt;p&gt;There are two ways to work with agents in Codex: &lt;strong&gt;Delegate&lt;/strong&gt; or &lt;strong&gt;collaborate.&lt;/strong&gt;&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;strong&gt;Delegate&lt;/strong&gt; tasks that are predictable, repeatable, and low-risk. With clear, well-specified instructions, the agent can execute autonomously and bring back finished work for your review.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Collaborate&lt;/strong&gt; on tasks that are judgment-heavy, exploratory, or iterative. You work alongside the model toward an outcome that matches your vision.&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;AI progress has reached a point where expertise is easy to replicate. Each new model can do more of what used to require rare skill—which creates both more opportunity and more noise. The people who work best in this environment know how to direct AI’s capability without losing their personal judgment. 
They &lt;u&gt;&lt;a href="https://every.to/p/after-automation" rel="noopener noreferrer" target="_blank"&gt;ride the models&lt;/a&gt;&lt;/u&gt; rather than being overwhelmed by them.&lt;/p&gt;&lt;p&gt;Expert Codex users are one of the clearest examples of what that looks like in practice.&lt;/p&gt;&lt;p&gt;This guide is about becoming one of those people. It covers how to set up a workspace, run high-leverage knowledge-work tasks, and turn repeated work into durable systems that get better over time. If you’re ready to think of your work in terms of systems instead of one-off tasks, this guide is for you.&lt;/p&gt;&lt;p data-guide-block-kind="agent-buttons" data-guide-block-id="guide-block-1779827761591-u9k6gl"&gt;&lt;br&gt;&lt;/p&gt;&lt;h2&gt;&lt;strong&gt;Part 1: Understanding Codex&lt;/strong&gt;&lt;/h2&gt;&lt;h3&gt;What Codex is&lt;/h3&gt;&lt;p&gt;Codex is a tool-using agentic workspace: You give it a goal and it plans the work, uses available tools and context, and produces a result for you to review. 
It can read and write files on your computer, connect to external services through plugins and other integrations, run multi-step tasks without asking for guidance, generate code and scripts when a task needs them, and maintain context across a persistent workspace.&lt;/p&gt;&lt;p&gt;Specific capabilities that make Codex worth using: &lt;/p&gt;&lt;ul&gt;&lt;li&gt;Works alongside you on multiple tasks in parallel&lt;/li&gt;&lt;li&gt;Pulls context from the apps and files you connect&lt;/li&gt;&lt;li&gt;Uses a supported browser and desktop workflows when a task needs on-screen action&lt;/li&gt;&lt;li&gt;Checks its own work, revises, and keeps going&lt;/li&gt;&lt;li&gt;Holds a persistent goal across a long-running session, instead of treating each message as a one-off request&lt;/li&gt;&lt;li&gt;Turns repeatable tasks into recurring workflows&lt;/li&gt;&lt;li&gt;Helps route shared requests from places like Slack, email, or forms&lt;/li&gt;&lt;li&gt;Lets you start, steer, approve, and review work from your phone while Codex works in the cloud or on a machine, such as a Mac Mini, that you keep awake&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;These capabilities make Codex useful both for delegating well-specified tasks and as a shared workspace for human-agent collaboration. Deciding which mode fits which needs is the meta-skill of modern knowledge work.&lt;strong&gt; &lt;/strong&gt;&lt;/p&gt;&lt;h4 data-guide-block-kind="callout" data-guide-block-id="guide-block-1779817335666-b66wia"&gt;&lt;strong&gt;A note on Goals&lt;/strong&gt;&lt;/h4&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1779817335666-b66wia"&gt;A Goal in Codex, initiated using the &lt;code&gt;/goal&lt;/code&gt; command, is a persistent objective that shapes an entire session rather than living and dying with a single message. Instead of re-briefing the agent on every turn, you tell it what “done” looks like, how success gets checked, and which constraints to respect. 
Codex then keeps working toward that outcome across interruptions and session breaks. Goals let you delegate long-horizon work, collaborate without losing the thread, and compound progress over time instead of restarting from scratch. &lt;/p&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1779817335666-b66wia"&gt;A simple test for when to use &lt;code&gt;/goal&lt;/code&gt;: If you’d type the same sentence into three prompts in a row—“cite every factual claim, match the house style, never send without my review”—make it a goal instead.&lt;/p&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1779817335666-b66wia"&gt;&lt;strong&gt;Goals versus skills. &lt;/strong&gt;A skill is a reusable set of packaged instructions (sometimes with scripts) that teaches Codex how to handle a recurring kind of task well. A goal, on the other hand, is what you’re trying to accomplish in a given stretch of work. It guides one session until the objective is met, then it’s done.&lt;/p&gt;&lt;h4&gt;&lt;strong&gt;Codex on mobile&lt;/strong&gt;&lt;/h4&gt;&lt;p&gt;Codex also runs from your phone through the ChatGPT mobile app, remotely controlling the machine where your work is happening. The mobile app suits the lightweight parts of a workflow: You can kick off a task, answer a question, approve an action, or review a draft from anywhere. Heavier review still deserves a real screen.&lt;/p&gt;&lt;h3&gt;What Codex isn’t&lt;/h3&gt;&lt;p&gt;Codex isn’t a magic intern that can safely act without supervision. It isn’t a replacement for taste, judgment, or ownership. It isn’t a replacement for human review or fact-checking. 
It isn’t useful for tasks where the source data is inaccessible, the criteria for success are entirely subjective, or the stakes of an error are too high to allow autonomous action.&lt;/p&gt;&lt;h4&gt;&lt;strong&gt;Useful rules&lt;/strong&gt;&lt;/h4&gt;&lt;p&gt;A task is a good candidate for Codex if it has at least two of the following traits:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;It requires pulling data from multiple sources.&lt;/li&gt;&lt;li&gt;It involves repeated steps you do regularly.&lt;/li&gt;&lt;li&gt;It can be checked against objective criteria.&lt;/li&gt;&lt;li&gt;It produces a durable artifact—a document, a plan, a report, a script.&lt;/li&gt;&lt;li&gt;It benefits from synthesis across many inputs.&lt;/li&gt;&lt;li&gt;It’s annoying enough that you routinely delay or avoid it.&lt;/li&gt;&lt;/ul&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1779817287018-p1yi0r"&gt;&lt;strong&gt;Delegate &lt;/strong&gt;tasks&lt;strong&gt; &lt;/strong&gt;when they are: &lt;/p&gt;&lt;ul&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779817287018-p1yi0r"&gt;Repeatable&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779817287018-p1yi0r"&gt;Objective&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779817287018-p1yi0r"&gt;Checkable&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779817287018-p1yi0r"&gt;Low-risk&lt;/li&gt;&lt;/ul&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1779817287018-p1yi0r"&gt;&lt;strong&gt;Collaborate &lt;/strong&gt;on tasks that are: &lt;/p&gt;&lt;ul&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779817287018-p1yi0r"&gt;Ambiguous&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779817287018-p1yi0r"&gt;Judgment-heavy&lt;/li&gt;&lt;li data-guide-block-kind="callout" 
data-guide-block-id="guide-block-1779817287018-p1yi0r"&gt;Exploratory&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779817287018-p1yi0r"&gt;Iterative&lt;/li&gt;&lt;/ul&gt;&lt;h4&gt;&lt;strong&gt;Codex, Claude Code, and Claude Cowork&lt;/strong&gt;&lt;/h4&gt;&lt;p&gt;If you’ve used Claude Code, you already have a mental model for an agent that works on your machine. For broader knowledge work, OpenAI and Anthropic have arrived at a similar experience from different directions. &lt;/p&gt;&lt;p&gt;Anthropic packages everything into one Claude app with three modes: Chat, Code, and Cowork. Code began as a terminal tool for developers (Claude Code) and now has a graphical version inside the app—no terminal required. It’s built for code repositories, but with the right connectors it handles a lot of general knowledge work too. Cowork takes the same engine and aims it at non-coding work, with folder access, Chrome browsing, computer use, scheduled tasks, and persistent project memory.&lt;/p&gt;&lt;p&gt;Codex is OpenAI’s counterpart, but rather than split the work across modes, it puts coding and knowledge work in a single workspace. A few things give Codex an edge for knowledge work today:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;strong&gt;One surface, not two.&lt;/strong&gt; Anthropic splits agentic work between Code and Cowork; Codex handles both in the same place, so you’re never deciding which mode a task belongs in.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;A browser that works beside you.&lt;/strong&gt; Codex renders the pages inside the app itself as a shared view between you and the agent. The Claude app operates a stand-alone Chrome window or your full screen instead. For logged-in sites, both rely on a Chrome extension. In our experience, Codex’s built-in browser tends to be faster, more reliable, and more useful for collaborative work.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Connectors out of the box. 
&lt;/strong&gt;Codex comes with a catalog of connectors you authorize in a click; in the Claude app you add tools as MCP servers, which requires a bit more assembly. &lt;/li&gt;&lt;/ul&gt;&lt;p&gt;Which surface is right comes down to model preference and workflow habits; Codex has the edge for us today—but the labs ship fast, and that can change.&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;The Codex knowledge work loop&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;Every sustainable Codex workflow follows the same five-step pattern:&lt;/p&gt;&lt;h3 data-guide-block-kind="callout" data-guide-block-id="guide-block-1779817436733-2mrhhk"&gt;Connect → Contextualize → Delegate/collaborate → Review → Compound&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Connect: &lt;/strong&gt;Give Codex access to the systems you use for work—Gmail, Slack, Notion, Google Drive, your calendar, your analytics tools, your support platform, and/or local files. Without connected apps or source access, Codex is limited to the local/project files it can access, uploaded or linked materials, and context you provide in the thread. With connections, it can find what it needs on its own.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Contextualize:&lt;/strong&gt; Put your goals, preferences, project details, source links, review standards, and standing rules in files Codex can access, then cite those files in Codex’s &lt;u&gt;&lt;a href="http://agents.md" rel="noopener noreferrer" target="_blank"&gt;AGENTS.md&lt;/a&gt;&lt;/u&gt; file to make them readily available. This is the difference between an agent that has to be re-briefed every time and one that already understands who you are, what you’re working on, and how you like to work.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Delegate/collaborate:&lt;/strong&gt; Decide whether the task needs close collaboration or can run on its own. 
Either way, specify inputs, output format, and acceptance criteria, then let it work.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Review:&lt;/strong&gt; Check the output in the destination app. If Codex drafted Slack messages, review them in Slack. If it wrote a strategy document, review it in your word processor of choice, such as Google Docs, Notion, or &lt;strong&gt;&lt;u&gt;&lt;a href="https://proofeditor.ai/" rel="noopener noreferrer" target="_blank"&gt;Proof&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;. Content that looks fine in a terminal or the Codex app may read differently in the space where it will ultimately be used.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Compound:&lt;/strong&gt; Turn what works into something reusable. Save the prompt. Document the workflow. Add mistakes to your review checklist and keep your context files up to date. Each session should make future sessions faster.&lt;/p&gt;&lt;h2&gt;&lt;strong&gt;Part 2: Setup&lt;/strong&gt;&lt;/h2&gt;&lt;h3&gt;&lt;strong&gt;Connect your systems&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;Connect the tools you want Codex to have access to. This includes Gmail, Slack, Notion, Google Drive, your calendar, analytics tools, support platforms, or anything else for which Codex has an integration. Once the relevant tools are connected, Codex can look at your actual work context and suggest workflows based on your messages, files, meetings, and recurring tasks.&lt;/p&gt;&lt;p&gt;Connecting a tool isn’t the same thing as letting Codex act on it. Across everything you connect, Codex can read and draft while still asking for your approval before it sends, posts, archives, or deletes. That makes broad access low-risk early on: Connect generously so Codex can find workflows worth building. 
Then, once you know which ones you’ll keep, disconnect the tools you don’t need to reduce risk and limit unnecessary data exposure.&lt;/p&gt;&lt;h4 data-guide-block-kind="callout" data-guide-block-id="guide-block-1779818908797-042bpz"&gt;&lt;strong&gt;Three ways Codex reaches your tools&lt;/strong&gt;&lt;/h4&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1779818908797-042bpz"&gt;Codex can touch the same tool in more than one way, and knowing which access path is which saves a lot of confusion:&lt;/p&gt;&lt;ul&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779818908797-042bpz"&gt;&lt;strong&gt;Connectors (plugins)&lt;/strong&gt; give Codex structured, API-level access to an app—Gmail, Slack, Notion, your analytics tools. This is the most reliable and repeatable option, so use it whenever a connector exists.&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779818908797-042bpz"&gt;&lt;strong&gt;Browser use&lt;/strong&gt; lets Codex operate a web page directly through its in-app browser—useful for local previews, public pages, and anything you want to watch it do on a shared screen. 
For sites that require you to be signed in, like your email client, the Codex Chrome extension works inside your logged-in browser.&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779818908797-042bpz"&gt;&lt;strong&gt;Computer use&lt;/strong&gt; lets Codex see and operate your desktop the way a person would—clicking through an app, changing a setting, or working with software that only exists as a graphical interface.&lt;/li&gt;&lt;/ul&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1779818908797-042bpz"&gt;The rule of thumb: Reach for a connector first, the browser next, and computer use when nothing else can get to the task.&lt;/p&gt;&lt;p&gt;Starting prompt—use this once your integrations are set up:&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779817720669-ddieaq"&gt;Connect to the tools I use for work: [List your tools—Gmail, Slack, Notion, Drive, etc.]. Then look at my work patterns across those tools and suggest three workflows I should set up first. For each one, describe the input sources, the output artifact, how often it should run, what approval looks like, and what would make the workflow worth keeping long-term.&lt;/p&gt;&lt;p&gt;Once the relevant tools are connected and permissioned, this prompt lets Codex inspect the available work context and suggest automation candidates rather than forcing you to invent them.&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;Build your Codex workspace&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;Build Codex’s workspace before running any workflows. Skip this step and you’ll likely stall.&lt;/p&gt;&lt;p&gt;A Codex workspace is a folder—local on your machine, synced to GitHub if you want version control—that contains the context files, workflow instructions, and review standards Codex works from. 
Think of it as an onboarding manual the agent reads at the start of each session.&lt;/p&gt;&lt;h4&gt;&lt;strong&gt;An example workspace structure&lt;/strong&gt;&lt;/h4&gt;&lt;p data-guide-block-kind="terminal" data-guide-block-id="guide-block-1779817905236-7c2762"&gt;your-workspace/&lt;/p&gt;&lt;p data-guide-block-id="guide-block-1779817905236-7c2762" data-guide-block-kind="terminal"&gt;├── README.md                  # Start here—orientation&lt;/p&gt;&lt;p data-guide-block-kind="terminal" data-guide-block-id="guide-block-1779817905236-7c2762"&gt;├── identity/                  # About you&lt;/p&gt;&lt;p data-guide-block-kind="terminal" data-guide-block-id="guide-block-1779817905236-7c2762"&gt;│   ├── context.md&lt;/p&gt;&lt;p data-guide-block-kind="terminal" data-guide-block-id="guide-block-1779817905236-7c2762"&gt;│   ├── preferences.md&lt;/p&gt;&lt;p data-guide-block-kind="terminal" data-guide-block-id="guide-block-1779817905236-7c2762"&gt;│   └── rules.md&lt;/p&gt;&lt;p data-guide-block-kind="terminal" data-guide-block-id="guide-block-1779817905236-7c2762"&gt;├── playbooks/                 # Process—repeatable workflows&lt;/p&gt;&lt;p data-guide-block-kind="terminal" data-guide-block-id="guide-block-1779817905236-7c2762"&gt;│   ├── workflows/&lt;/p&gt;&lt;p data-guide-block-kind="terminal" data-guide-block-id="guide-block-1779817905236-7c2762"&gt;│   ├── inbox-sweep.md&lt;/p&gt;&lt;p data-guide-block-kind="terminal" data-guide-block-id="guide-block-1779817905236-7c2762"&gt;│   └── research-brief.md&lt;/p&gt;&lt;p data-guide-block-kind="terminal" data-guide-block-id="guide-block-1779817905236-7c2762"&gt;├── sources/                   # Source shelf—inputs&lt;/p&gt;&lt;p data-guide-block-kind="terminal" data-guide-block-id="guide-block-1779817905236-7c2762"&gt;│   ├── sources/&lt;/p&gt;&lt;p data-guide-block-kind="terminal" data-guide-block-id="guide-block-1779817905236-7c2762"&gt;│   ├── key-links.md&lt;/p&gt;&lt;p data-guide-block-kind="terminal" 
data-guide-block-id="guide-block-1779817905236-7c2762"&gt;│   └── recurring-docs.md&lt;/p&gt;&lt;p data-guide-block-kind="terminal" data-guide-block-id="guide-block-1779817905236-7c2762"&gt;├── outputs/                   # Finished work&lt;/p&gt;&lt;p data-guide-block-kind="terminal" data-guide-block-id="guide-block-1779817905236-7c2762"&gt;│   ├── outputs/&lt;/p&gt;&lt;p data-guide-block-kind="terminal" data-guide-block-id="guide-block-1779817905236-7c2762"&gt;│   ├── drafts/&lt;/p&gt;&lt;p data-guide-block-kind="terminal" data-guide-block-id="guide-block-1779817905236-7c2762"&gt;│   └── reports/&lt;/p&gt;&lt;p data-guide-block-kind="terminal" data-guide-block-id="guide-block-1779817905236-7c2762"&gt;└── reviews/                   # Quality checks—guardrails&lt;/p&gt;&lt;p data-guide-block-kind="terminal" data-guide-block-id="guide-block-1779817905236-7c2762"&gt;    ├── data-checklist.md&lt;/p&gt;&lt;p data-guide-block-kind="terminal" data-guide-block-id="guide-block-1779817905236-7c2762"&gt;    └── writing-checklist.md&lt;/p&gt;&lt;p&gt;What you’re doing here has a name: context engineering—a term popularized by Shopify CEO &lt;strong&gt;&lt;u&gt;&lt;a href="https://x.com/tobi/status/1935533422589399127" rel="noopener noreferrer" target="_blank"&gt;Tobi Lütke&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; and prominent AI engineer &lt;strong&gt;&lt;u&gt;&lt;a href="https://x.com/karpathy/status/1937902205765607626?lang=en" rel="noopener noreferrer" target="_blank"&gt;Andrej Karpathy&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;. Getting the right context to the model at the right time accounts for at least half of its performance.&lt;/p&gt;&lt;p&gt;At the start of each session, Codex looks at &lt;code&gt;AGENTS.md&lt;/code&gt;, which works as the table of contents. 
You can write your standing instructions directly in it, but we recommend keeping &lt;code&gt;AGENTS.md&lt;/code&gt; short and pointing it at more detailed files: &lt;code&gt;context.md&lt;/code&gt; for who you are and what you’re working on, &lt;code&gt;preferences.md&lt;/code&gt; for how you want the work done, and &lt;code&gt;rules.md&lt;/code&gt; for what it may and may not do without asking.&lt;/p&gt;&lt;h4 data-guide-block-kind="callout" data-guide-block-id="guide-block-1779818686951-lpvg9i"&gt;&lt;strong&gt;What to put in your context files&lt;/strong&gt;&lt;/h4&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1779818686951-lpvg9i"&gt;&lt;code&gt;context.md&lt;/code&gt; should cover:&lt;/p&gt;&lt;ul&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779818686951-lpvg9i"&gt;Your role and the function you own&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779818686951-lpvg9i"&gt;Active projects and their current status&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779818686951-lpvg9i"&gt;The tools you use daily and what each one is for&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779818686951-lpvg9i"&gt;The people or teams you work with most closely&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779818686951-lpvg9i"&gt;How decisions typically get made in your context&lt;/li&gt;&lt;/ul&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1779818686951-lpvg9i"&gt;&lt;code&gt;preferences.md&lt;/code&gt; should cover:&lt;/p&gt;&lt;ul&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779818686951-lpvg9i"&gt;Writing style and tone (formal or conversational, terse or thorough)&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779818686951-lpvg9i"&gt;Communication preferences (what you like to 
review before it goes out and what can be drafted and queued without your involvement)&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779818686951-lpvg9i"&gt;Decision-making preferences (when to ask before acting and when to proceed and report back)&lt;/li&gt;&lt;/ul&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1779818686951-lpvg9i"&gt;&lt;code&gt;rules.md&lt;/code&gt; should cover:&lt;/p&gt;&lt;ul&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779818686951-lpvg9i"&gt;What Codex may never do without explicit approval: Send, post, archive, delete, modify a source of truth, or move money&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779818686951-lpvg9i"&gt;What Codex may do without asking: Draft, summarize, research, outline, organize&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779818686951-lpvg9i"&gt;Any standing constraints specific to your work (e.g., client confidentiality rules, brand standards, data handling requirements)&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;Starting prompt—use this to have Codex create your workspace structure:&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779818637346-s57oa8"&gt;[First: Create a folder on your desktop called “Codex”] &lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779818637346-s57oa8"&gt;Set up this folder as a simple Codex workspace for knowledge work.&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779818637346-s57oa8"&gt;Create three starter files:&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779818637346-s57oa8"&gt;    1. 
context.md—who I am, what I’m working on, what tools I use, and who I work with&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779818637346-s57oa8"&gt;    2. preferences.md—how I like work to be written, reviewed, and handled&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779818637346-s57oa8"&gt;    3. rules.md—what you may do without asking, what you must ask before doing, and what you must never do&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779818637346-s57oa8"&gt;Interview me one question at a time to gather the information you need to fill in each file.&lt;/p&gt;&lt;h4&gt;&lt;strong&gt;The “one pinned chat per project” rule&lt;/strong&gt;&lt;/h4&gt;&lt;p&gt;The workspace folder is for your context; pinned chats are for your work. You can find the option to pin a chat next to the chat name in the app’s left-hand navigation bar. A useful habit from day one is to keep one persistent, pinned thread per project or area of responsibility—one for the product launch, one for weekly reporting, one for recruiting—rather than spinning up a fresh chat for every request. A standing thread accumulates context as you go, so Codex remembers what you have already established and you don’t have to re-explain the project each time. A pinned chat with a clear goal, plus the history in the thread itself, turns Codex into a reliable home for that stream of work.&lt;/p&gt;&lt;h2&gt;&lt;strong&gt;Part 3: The five levels of Codex use&lt;/strong&gt;&lt;/h2&gt;&lt;p&gt;Codex power users don’t become proficient all at once. They get there in stages, and each stage calls for a different way of thinking about what Codex is doing and what it’s good for. Skip ahead too quickly, and you’ll get frustrated—either you don’t trust it yet, or you haven’t built the infrastructure for more autonomous work. 
At every level, you should know when to hand work to Codex and when to stay in the loop as its collaborator.&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;Level 1: One-off knowledge work&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Mental model:&lt;/strong&gt; Codex as a capable, thorough research and drafting assistant.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Mode: &lt;/strong&gt;Collaborate. At this level, nothing is automated. You run single-session tasks, review everything before it leaves your hands, and build familiarity with how Codex handles different types of work.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Best first tasks:&lt;/strong&gt;&lt;/p&gt;&lt;ul&gt;&lt;li&gt;Summarize a meeting transcript and extract decisions, open questions, and follow-up actions.&lt;/li&gt;&lt;li&gt;Turn scattered notes into a structured outline.&lt;/li&gt;&lt;li&gt;Build a research brief from a set of links and documents.&lt;/li&gt;&lt;li&gt;Rewrite a draft against a style guide.&lt;/li&gt;&lt;li&gt;Create a review checklist for a document, launch plan, or strategy memo.&lt;/li&gt;&lt;li&gt;Convert a written draft into an audio file for editing on the go.&lt;/li&gt;&lt;/ul&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779818837556-dazs6l"&gt;Use the attached [documents/links/notes] to produce [specific artifact]. Prioritize accuracy over elegance. Include source links for any factual claims. Flag anything that is uncertain or requires my verification. End with the three questions I should answer before this artifact is ready to use.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Review habit:&lt;/strong&gt; Before polishing any output, ask Codex to list the assumptions it made and where it is least confident. 
This surfaces problems before you invest time in refinement.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Move to Level 2 when:&lt;/strong&gt; You keep wishing Codex remembered what you told it last time.&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;Level 2: Multi-source workflows&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Mental model:&lt;/strong&gt; Codex as a cross-system analyst that can assemble information you could never pull together manually in a reasonable amount of time.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Mode: &lt;/strong&gt;Collaborate. At this level, Codex can synthesize outputs from multiple connected systems—Slack threads, Notion pages, email archives, analytics dashboards, and Google Drive documents—but it still needs guidance and feedback.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Example multi-source tasks:&lt;/strong&gt;&lt;/p&gt;&lt;ul&gt;&lt;li&gt;A go-to-market plan built from internal meeting transcripts, Slack discussions, customer notes, and a strategy template&lt;/li&gt;&lt;li&gt;A weekly KPI report from analytics, revenue data, support volume, and social metrics&lt;/li&gt;&lt;li&gt;A summary synthesized from Slack, Notion, Drive links, and past drafts&lt;/li&gt;&lt;li&gt;A weekly leadership brief assembled from team standups, metrics, and open decisions&lt;/li&gt;&lt;/ul&gt;&lt;p data-guide-block-label="How to delegate a multi-source task" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779819023849-bi5kjp"&gt;I need [specific artifact].&lt;/p&gt;&lt;p data-guide-block-label="How to delegate a multi-source task" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779819023849-bi5kjp"&gt;Sources to use:&lt;/p&gt;&lt;p data-guide-block-label="How to delegate a multi-source task" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779819023849-bi5kjp"&gt; - [Tool 1]: [what to look for there]&lt;/p&gt;&lt;p data-guide-block-label="How to delegate a multi-source task" data-guide-block-kind="prompt" 
data-guide-block-id="guide-block-1779819023849-bi5kjp"&gt; - [Tool 2]: [what to look for there]&lt;/p&gt;&lt;p data-guide-block-label="How to delegate a multi-source task" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779819023849-bi5kjp"&gt; - [Tool 3]: [what to look for there]&lt;/p&gt;&lt;p data-guide-block-label="How to delegate a multi-source task" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779819023849-bi5kjp"&gt;Output format: [describe the structure you want]&lt;/p&gt;&lt;p data-guide-block-label="How to delegate a multi-source task" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779819023849-bi5kjp"&gt;Before you start, give me a short plan: Identify the sources you will inspect, the artifact you will produce, any gaps or unknowns you anticipate, and the checks you will run before calling it done. If anything requires sending, posting, archiving, or modifying a source of truth, ask first.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;A warning about data:&lt;/strong&gt; A one-shot attempt at pulling data from multiple systems can be wrong because of stale data, mismatched definitions, permissions gaps, or join errors. For any metric that informs business decisions or agent actions, verify column by column against your primary source. The closer a number is to a source of truth, the more carefully it needs to be checked.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Make your outputs agent-readable:&lt;/strong&gt; Plans and reports you generate in Codex will be read by other people—but also, increasingly, by their agents. Write them in plain, structured language that a human can scan and an agent can query. 
Clear section headers, explicit decisions, and labeled action items make the artifact useful in both directions.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Move to Level 3 when:&lt;/strong&gt; You keep running the same multi-source workflow more than once a week and wishing it happened automatically.&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;Level 3: Repeated chores into persistent workflows &lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Mental model:&lt;/strong&gt; Codex as an automated operations layer that handles predictable, recurring work so you don’t have to.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Mode: &lt;/strong&gt;Hybrid. Some tasks are fully predictable and can run without back-and-forth. These tasks are ripe for &lt;strong&gt;delegation&lt;/strong&gt;. Tasks that involve judgment, strategy, or creative decisions suit &lt;strong&gt;collaboration.&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;A useful heuristic: If you could write a checklist that covers 90 percent of the cases, delegate it. If you would need to think about it differently each time, collaborate.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;In either case, look for “computer chores”&lt;/strong&gt;—recurring tasks that take time and attention, but don’t require human judgment at every single touchpoint.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Common chore candidates:&lt;/strong&gt;&lt;/p&gt;&lt;ul&gt;&lt;li&gt;End-of-day check for unanswered Slack messages and emails, with drafted replies&lt;/li&gt;&lt;li&gt;Weekly metrics brief from analytics, revenue, and support data&lt;/li&gt;&lt;li&gt;Meeting-note cleanup and action-item extraction after each recorded call&lt;/li&gt;&lt;li&gt;Customer support pattern detection and issue routing&lt;/li&gt;&lt;li&gt;Draft-to-review package that formats a piece for editor handoff&lt;/li&gt;&lt;li&gt;Recruiting research for an open role&lt;/li&gt;&lt;/ul&gt;&lt;p data-guide-block-label="Workflow template" data-guide-block-kind="template" data-guide-block-id="guide-block-1779819137935-8no8us"&gt;Before building 
any persistent workflow, fill out this template. It becomes the instruction file Codex reads every time the workflow runs. (The workflows in Part 4 are each an example of this canvas applied.)&lt;/p&gt;&lt;p data-guide-block-label="Workflow template" data-guide-block-kind="template" data-guide-block-id="guide-block-1779819137935-8no8us"&gt;Workflow name:&lt;/p&gt;&lt;p data-guide-block-label="Workflow template" data-guide-block-kind="template" data-guide-block-id="guide-block-1779819137935-8no8us"&gt;Trigger or cadence:&lt;/p&gt;&lt;p data-guide-block-label="Workflow template" data-guide-block-kind="template" data-guide-block-id="guide-block-1779819137935-8no8us"&gt;Input sources:&lt;/p&gt;&lt;p data-guide-block-label="Workflow template" data-guide-block-kind="template" data-guide-block-id="guide-block-1779819137935-8no8us"&gt;Output artifact:&lt;/p&gt;&lt;p data-guide-block-label="Workflow template" data-guide-block-kind="template" data-guide-block-id="guide-block-1779819137935-8no8us"&gt;Approval rules:&lt;/p&gt;&lt;p data-guide-block-label="Workflow template" data-guide-block-kind="template" data-guide-block-id="guide-block-1779819137935-8no8us"&gt;What Codex may do without asking:&lt;/p&gt;&lt;p data-guide-block-label="Workflow template" data-guide-block-kind="template" data-guide-block-id="guide-block-1779819137935-8no8us"&gt;What Codex must ask before doing:&lt;/p&gt;&lt;p data-guide-block-label="Workflow template" data-guide-block-kind="template" data-guide-block-id="guide-block-1779819137935-8no8us"&gt;Verification steps:&lt;/p&gt;&lt;p data-guide-block-label="Workflow template" data-guide-block-kind="template" data-guide-block-id="guide-block-1779819137935-8no8us"&gt;Where the final output lives:&lt;/p&gt;&lt;p data-guide-block-label="Workflow template" data-guide-block-kind="template" data-guide-block-id="guide-block-1779819137935-8no8us"&gt;When to retire or revise this workflow:&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Review discipline for automated 
workflows:&lt;/strong&gt; Don’t review automated output inside Codex. Draft in Codex, then review in the destination app—Slack for Slack messages, Gmail for email drafts, word processors for documents. Content that looks fine in a terminal often reads differently in the tool where it’s ultimately used, and the context switch catches things a Codex review pass would miss.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Move to Level 4 when:&lt;/strong&gt; Your prompt-based workflow hits a ceiling—the task is too complex or too custom to handle in text alone, and a small script or local tool would make it reliable.&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;Level 4: Build small tools when prompts are not enough&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Mental model:&lt;/strong&gt; Codex as a builder that creates lightweight infrastructure to make your workflows more reliable, faster, or more repeatable.&lt;/p&gt;&lt;p&gt;Sometimes the best Codex output is a small script, a local app, a custom dashboard, or a review surface that makes a recurring workflow easier, rather than pure text.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Mode: &lt;/strong&gt;Hybrid. In some cases, Codex may generate an artifact independently for you to review and then move on. In others, the artifact it produces may become a space where you and the agent iterate together.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Examples of when a small tool helps:&lt;/strong&gt;&lt;/p&gt;&lt;ul&gt;&lt;li&gt;A recurring workflow that requires pulling from an API that has no Codex integration. A short script handles the connection reliably.&lt;/li&gt;&lt;li&gt;A review process where you need to see formatted output side by side with the source. A simple local app gives you the view.&lt;/li&gt;&lt;li&gt;A task that needs to run on a schedule without your involvement. A script set to run on a timer (a cron job) handles the timing.&lt;/li&gt;&lt;li&gt;A workflow that accumulates structured data over time. 
A lightweight database or structured file tracks it persistently.&lt;/li&gt;&lt;/ul&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820329598-tdqexj"&gt;&lt;strong&gt;Practical approach for non-engineers:&lt;/strong&gt;&lt;/p&gt;&lt;ol&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820329598-tdqexj"&gt;Run the task manually in Codex once to confirm the output is what you want&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820329598-tdqexj"&gt;Ask Codex: “Which steps in this workflow could be made more reliable with a small script or tool?”&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820329598-tdqexj"&gt;Have Codex prototype the tool and explain what it does in plain language&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820329598-tdqexj"&gt;Run it on your data and verify the output matches what the manual process produced&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820329598-tdqexj"&gt;Keep only the parts that reduce friction. Discard what adds complexity without benefit.&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;You don’t need to understand every line of code to use a tool Codex built. You do need to understand what data it touches, what it produces, and where the review step is. 
If you can’t explain those three things, the tool isn’t ready to run autonomously.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Move to Level 5 when: &lt;/strong&gt;You give Codex the same feedback repeatedly and have standing preferences you’d like it to apply on its own.&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;Level 5: Compound your Codex system&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Mental model:&lt;/strong&gt; Codex as a system that can improve over time when you save useful workflows, maintain review rules, and use memories or skills to codify preferences where available.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Mode: &lt;/strong&gt;Hybrid. Some instructions will dictate how the agent approaches autonomous work; others will guide how the model interacts with you in collaboration mode.&lt;/p&gt;&lt;p&gt;The idea of “compounding” work comes from &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/guides/compound-engineering" rel="noopener noreferrer" target="_blank"&gt;compound engineering&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;, the AI-native coding methodology coined by &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@kieran_1355" rel="noopener noreferrer" target="_blank"&gt;Kieran Klaassen&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; and &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@nityesh" rel="noopener noreferrer" target="_blank"&gt;Nityesh Agarwal&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; while building &lt;strong&gt;&lt;u&gt;&lt;a href="http://cora.computer" rel="noopener noreferrer" target="_blank"&gt;Cora&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;, Every’s email client. The canonical example is a product requirements document (PRD) that writes the scaffolding for the next one: The artifact you produce becomes the tool that speeds up the next round. The four habits below are how you put it into practice as a knowledge worker, not just an engineer.&lt;/p&gt;&lt;p&gt;Remember: &lt;strong&gt;Each useful session should make future sessions faster and more reliable. 
&lt;/strong&gt;In practice, that requires doing four things consistently after completing any significant piece of work:&lt;/p&gt;&lt;p&gt;&lt;strong&gt;1. Save successful prompts as workflow files.&lt;/strong&gt; When a prompt produces exactly the right output, document it. Write down the input sources, the exact prompt, the output format, and the review step. Save it in your workflows/ folder. The next time you need the same output, the agent will have that reference to work from.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;2. Add mistakes to review checklists.&lt;/strong&gt; When Codex gets something wrong—a number that was off, a tone that missed the mark, or an assumption it should not have made—add a specific check to your relevant review file, and instruct Codex to check its work against those guardrails.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;3. Update your context files after major projects.&lt;/strong&gt; When a project ends, update context.md to reflect what changed—new priorities, new tools, what worked, and what didn’t. Codex can use this when you point it to the file, turn it into a skill/workflow, or store the pattern in Codex memory where available.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;4. Ask Codex to identify compounding opportunities.&lt;/strong&gt; At the end of any session where you did something useful, run this prompt:&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820367058-c271lt"&gt;Based on what we just did, what parts of this workflow should become a reusable skill, an automation, or a small tool? 
What context should I add to my project files so we don’t have to re-establish this next time?&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Forking for your discipline:&lt;/strong&gt; The &lt;u&gt;&lt;a href="https://github.com/EveryInc/compound-engineering-plugin" rel="noopener noreferrer" target="_blank"&gt;compound engineering plugin&lt;/a&gt;&lt;/u&gt;, Every’s open-source system for structured agent workflows, installable in Codex with one command, works for knowledge work out of the box, but its review agents are optimized for coding needs like establishing frontend patterns and reviewing for code performance.&lt;/p&gt;&lt;p&gt;Knowledge workers can fork it into a version with reviewers tuned for strategic alignment, data accuracy, writing quality, and communication standards. A forked version, &lt;u&gt;&lt;a href="https://github.com/EveryInc/compound-knowledge-plugin" rel="noopener noreferrer" target="_blank"&gt;compound knowledge&lt;/a&gt;&lt;/u&gt;, is publicly available on Every’s GitHub, and is designed to be readable and editable by non-engineers.&lt;/p&gt;&lt;h2&gt;&lt;strong&gt;Part 4: Workflow library&lt;/strong&gt;&lt;/h2&gt;&lt;p&gt;These workflows are meant as inspiration to get you started. Adapt the inputs, outputs, and approval rules to your specific tools and standards.&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;1. 
Inbox zero review queue&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; Anyone whose email backlog is a recurring source of anxiety or dropped balls.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Input sources:&lt;/strong&gt; Gmail or your email client of choice.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Output artifact:&lt;/strong&gt; A structured list of draft replies, proposed actions (archive, delegate, flag), and any emails flagged for your personal attention because the draft alone isn’t sufficient.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@danshipper" rel="noopener noreferrer" target="_blank"&gt;Dan Shipper&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; kept inbox zero for 10 days straight with Codex. To use this workflow, have Codex:&lt;/p&gt;&lt;ol&gt;&lt;li&gt;Gather email through &lt;u&gt;&lt;a href="https://cora.computer/" rel="noopener noreferrer" target="_blank"&gt;Cora&lt;/a&gt;&lt;/u&gt; running in the in-app browser.&lt;/li&gt;&lt;li&gt;Render the email queue as a single page.&lt;/li&gt;&lt;li&gt;Go through each item with you as you dictate the action the AI should take (e.g., “research this,” “draft that,” “pull the documents our lawyers asked for”). You can do this via chat or by voice with a dictation tool like &lt;strong&gt;&lt;u&gt;&lt;a href="https://www.monologue.to/" rel="noopener noreferrer" target="_blank"&gt;Monologue&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; (we recommend voice). 
&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;&lt;strong&gt;First prompt:&lt;/strong&gt;&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820476286-abtpfb"&gt;	Go through my inbox for the past [time period].&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820476286-abtpfb"&gt;	For each email that needs a response or action:&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820476286-abtpfb"&gt;1. Categorize it: needs reply/needs action/can archive/already handled&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820476286-abtpfb"&gt;2. If it needs a reply, draft one in my voice using the style in &lt;code&gt;preferences.md&lt;/code&gt;&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820476286-abtpfb"&gt;3. If it needs action, describe the action clearly&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820476286-abtpfb"&gt;4. Flag any email where a draft reply isn’t enough—where I need to think about this personally before responding&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820476286-abtpfb"&gt;	Don’t send anything. Create drafts only. I will review in Gmail.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Review step:&lt;/strong&gt; Review all drafts in Gmail before sending. Don’t approve from inside Codex.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;How to compound:&lt;/strong&gt; After a few sessions, add a rule file describing your categorization preferences—which senders always get priority, which topics can be archived without reply, and which types of requests need a human-written response. &lt;/p&gt;&lt;h3&gt;&lt;strong&gt;2. 
Daily unanswered message roundup&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; Anyone who communicates across Slack, email, and other channels and loses track of what still needs a response.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Input sources:&lt;/strong&gt; Slack, Gmail, any other communication tool you use.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Output artifact:&lt;/strong&gt; A list of unanswered items with drafted replies or proposed reactions, organized by urgency.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;First prompt:&lt;/strong&gt;&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820496080-1lm39w"&gt;Look across my Slack and Gmail for the past 24 hours. Find everything that was directed at me that I have not responded to.&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820496080-1lm39w"&gt;	For each item:&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820496080-1lm39w"&gt;1. Draft a reply or suggest a reaction (thumbs up, etc.) if a short acknowledgment is appropriate&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820496080-1lm39w"&gt;2. Flag items where a more considered response is needed&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820496080-1lm39w"&gt;3. Flag anything time-sensitive&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820496080-1lm39w"&gt;	Present the list organized by urgency. 
Don’t send anything.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Review step:&lt;/strong&gt; Review in Slack and Gmail.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;How to compound:&lt;/strong&gt; After a few runs, save a rules file specifying which Slack channels are high-priority, which senders always warrant a human response, and which types of messages can be handled with a reaction rather than a reply.&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;3. Research brief creation&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; Anyone preparing for a meeting, a pitch, a content piece, or a strategic decision and needing a thorough, sourced summary of a topic.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Input sources:&lt;/strong&gt; Provided links, Notion, Drive, web search.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Output artifact:&lt;/strong&gt; A structured brief with background, key facts, open questions, and source links.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;First prompt:&lt;/strong&gt;&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820511227-dhvhjc"&gt;	Build a research brief on [topic]. 
&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820511227-dhvhjc"&gt;	Sources to prioritize: [List any specific links, documents, or databases].&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820511227-dhvhjc"&gt;	Structure the brief as:&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820511227-dhvhjc"&gt;	- Background: what I need to know to have a smart conversation about this&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820511227-dhvhjc"&gt;	- Key facts and data points, each with a source link&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820511227-dhvhjc"&gt;	- Competing perspectives or significant disagreements in the field&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820511227-dhvhjc"&gt;	- Open questions I should be able to answer before [meeting/decision/deadline]&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820511227-dhvhjc"&gt;	- Three things I should read next if I want to go deeper&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820511227-dhvhjc"&gt;Flag any claims you are less than confident about.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Review step:&lt;/strong&gt; Check source links. Verify any statistics against the original source before using them.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;How to compound:&lt;/strong&gt; Save a brief template in your workflows/ folder. 
After each brief, add any recurring sources (newsletters, databases, key authors) to your sources/key-links.md so Codex checks them by default.&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;4. Writing with a parallel review loop&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; Writers who want Codex running alongside them as they draft—checking the work, flagging issues, and responding in parallel without interrupting the writing session.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Input sources:&lt;/strong&gt; Your draft (open in your word processor through Codex’s in-app browser), any relevant style guides, source documents, or review standards in your workspace.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Output artifact:&lt;/strong&gt; An annotated draft with inline feedback, flagged issues, and suggested revisions—produced continuously as you write rather than in a single pass at the end.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Setup: &lt;/strong&gt;Open your draft in Proof or the in-app browser. Start a Codex session with your workspace context loaded. Give Codex standing instructions for what to monitor and how to respond.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;First prompt:&lt;/strong&gt;&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820529028-05kfk1"&gt;	I am writing [describe the piece—type, audience, purpose].&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820529028-05kfk1"&gt;As I draft, run a continuous review loop. 
Check for:&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820529028-05kfk1"&gt;- Claims that need a source or are stated with more confidence than the evidence supports&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820529028-05kfk1"&gt;- Passages where the argument loses clarity or the logic has a gap&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820529028-05kfk1"&gt;- Sentences that violate the style preferences in &lt;code&gt;preferences.md&lt;/code&gt;&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820529028-05kfk1"&gt;- Anything that reads as filler, throat-clearing, or AI-generated phrasing&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820529028-05kfk1"&gt;Don’t rewrite anything without being asked. Flag issues as I go with a brief note on what the problem is and what would fix it. Check in every [X minutes / X paragraphs] or when I ask.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Review step:&lt;/strong&gt; Read the flagged issues at natural stopping points—the end of a section or session. Decide which to address and which to dismiss. Don’t let the feedback loop interrupt the drafting flow; the value is in the accumulation, not in responding to every flag in real time.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;How to compound:&lt;/strong&gt; After each writing session, add any recurring flags to your reviews/writing-checklist.md. Patterns that come up repeatedly are candidates for a standing rule in your preferences file, so Codex catches them automatically next time.&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;5. 
Source management for research&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; Writers and researchers who need to organize source material before drafting.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Input sources:&lt;/strong&gt; Links, PDFs, past drafts, notes, transcripts.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Output artifact:&lt;/strong&gt; A structured document with the core argument, supporting evidence organized by claim, counterarguments, and a gap analysis (what is still missing).&lt;/p&gt;&lt;p&gt;&lt;strong&gt;First prompt:&lt;/strong&gt;&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820559583-q3ha4f"&gt;	I am writing a piece on [topic]. The core argument I want to make is [argument].&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820559583-q3ha4f"&gt;	Here are my source materials: [links/documents].&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820559583-q3ha4f"&gt;	Build an evidence room that:&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820559583-q3ha4f"&gt;1. States the core argument clearly&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820559583-q3ha4f"&gt;	2. Lists the strongest supporting evidence for each main point, with source links&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820559583-q3ha4f"&gt;	3. Lists the strongest counterarguments and how I might address them&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820559583-q3ha4f"&gt;	4. 
Identifies any gaps—claims I am making that lack strong evidence&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820559583-q3ha4f"&gt;	5. Flags any sources that conflict with each other&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Review step:&lt;/strong&gt; Read the evidence room before drafting. Verify any statistics or quotes you plan to use directly.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;How to compound:&lt;/strong&gt; Save the evidence format as a workflow template. Add a standing note to your context file about your writing voice and recurring themes so Codex calibrates its framing.&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;6. Information via audio&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; Anyone who processes information better by listening than reading, or who wants to take time away from a screen but stay on top of work.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Input sources:&lt;/strong&gt; Any written content: drafts, research briefs, meeting summaries, strategy documents, reports, lengthy emails, articles.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Output artifact:&lt;/strong&gt; An audio file saved to a location accessible from your phone (Dropbox, Drive, etc.).&lt;/p&gt;&lt;p&gt;&lt;strong&gt;First prompt:&lt;/strong&gt;&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820573319-wgfep4"&gt;Convert the attached [document/draft/report] into a clear audio file. Read it at a natural pace—not rushed, not slow. Save it to [Dropbox/Drive location] as [filename].&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Review step:&lt;/strong&gt; Listen on your commute, walk, or wherever you have time away from a screen. Take notes on your phone as things come up. 
Return to the source material with whatever you noticed.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;How to compound:&lt;/strong&gt; Add a standing instruction to your context file covering your audio preferences—such as speed, file format, naming convention, and preferred save location—so you do not have to specify each time. You can also prompt Codex to convert content automatically at the end of certain workflows: “After generating the weekly metrics report, convert it to audio and save to [location].”&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;7. Go-to-market plan generator&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; Anyone responsible for launching a product, feature, or initiative and who has done the thinking in meetings and Slack but has not had time to formalize it.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Input sources:&lt;/strong&gt; Meeting transcripts, Slack threads, customer notes, a preferred strategy template.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Output artifact:&lt;/strong&gt; A complete go-to-market plan, structured for human review and agent querying.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;First prompt:&lt;/strong&gt;&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820595232-4q50ag"&gt;	Build a go-to-market plan for [product/initiative]. 
&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820595232-4q50ag"&gt;	Sources to pull from:&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820595232-4q50ag"&gt;	- Meeting transcripts: [Notion location or links]&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820595232-4q50ag"&gt;	- Slack discussions: [channels or search terms]&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820595232-4q50ag"&gt;	- Customer research: [document or location]&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820595232-4q50ag"&gt;	- Template to follow: [link or paste template]&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820595232-4q50ag"&gt;The plan should be readable by a human in five minutes and structured so that an agent can answer specific questions about it (e.g., “What is the target ICP?” “What is the launch timeline?”).&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820595232-4q50ag"&gt;Start with a compound engineering brainstorm step. Give me a draft in Proof or Notion. Flag anything in the plan you added that was not in the source material—I only want synthesis of what we have already decided, not new suggestions baked in.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Review step:&lt;/strong&gt; Review in Notion or Proof. Verify that every major claim traces to something in the source material. Anything the model added that was not in your sources should be flagged for your decision.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;How to compound:&lt;/strong&gt; Save the template and the prompt. 
After each launch, add a retrospective note to your context file about what the plan got right and wrong. Future plans will be calibrated by past ones.&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;8. KPI report&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; Anyone responsible for tracking metrics and needing a regular, reliable view across multiple data sources.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Input sources:&lt;/strong&gt; Analytics (PostHog, Mixpanel, Amplitude), revenue data (Stripe), support volume, social metrics, saved past reports.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Output artifact:&lt;/strong&gt; A one-page report covering headlines, usage metrics, system health, and follow-up items.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;First prompt:&lt;/strong&gt;&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820607666-pyt5wv"&gt;	Generate a product pulse report for [time period].&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820607666-pyt5wv"&gt;	Data sources:&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820607666-pyt5wv"&gt;	- Product analytics: [tool and what to pull]&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820607666-pyt5wv"&gt;	- Revenue: [tool and what to pull]&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820607666-pyt5wv"&gt;	- Support: [tool and what to pull]&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820607666-pyt5wv"&gt;	- Social: [tool and what to pull]&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820607666-pyt5wv"&gt;	Structure:&lt;/p&gt;&lt;p 
data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820607666-pyt5wv"&gt;	1. Headlines (three to five bullets summarizing what matters most)&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820607666-pyt5wv"&gt;2. Usage (primary engagement metric, value-realization metric, conversions, deltas vs. prior period)&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820607666-pyt5wv"&gt;3. System health (error rates, latency, top error signatures)&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820607666-pyt5wv"&gt;	4. Follow-ups (one to five things worth investigating, specific enough to act on)&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820607666-pyt5wv"&gt;Flag any number that differs significantly from the prior report. If something is anomalous, investigate one level deeper before including it.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Review step:&lt;/strong&gt; Verify every number in the report against its source. Don’t use this report as a business source of truth until you have confirmed accuracy column by column. In practice, one-shot metrics pulls are often five to 10 percent off—a common result of definition mismatches and join errors across multi-source pulls.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;How to compound:&lt;/strong&gt; Save each report as a dated file in your outputs/reports/ folder. Over time, Codex can compare reports, identify trends, and flag when something has changed. The folder becomes the working memory of your product.&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;9. 
Customer support for product work&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; Teams where support patterns should feed into product decisions and small fixes.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Input sources:&lt;/strong&gt; Support platform (Intercom, Zendesk), issue tracker (Linear, GitHub Issues).&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Output artifact:&lt;/strong&gt; A deduplicated list of issues with suggested priority, plus small issues ready to hand off for fixes.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;First prompt:&lt;/strong&gt;&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820617261-k37uky"&gt;	Go through my support queue for the past [time period]. &lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820617261-k37uky"&gt;	For each support thread:&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820617261-k37uky"&gt;	1. Identify the underlying issue or request.&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820617261-k37uky"&gt;	2. Check whether a similar issue already exists in [Linear/GitHub Issues].&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820617261-k37uky"&gt;	3. If it does, link them. If it doesn’t, draft a new issue.&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820617261-k37uky"&gt;	4. Flag any issue that appears more than [threshold] times—these are priorities.&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820617261-k37uky"&gt;	5. 
For issues that appear straightforward to fix, note that they are candidates for direct implementation.&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820617261-k37uky"&gt;	Don’t create issues in the tracker yet. Give me the list to review first.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Review step:&lt;/strong&gt; Review the issue list before anything goes into the tracker. Confirm deduplication is accurate—support tickets often describe the same underlying problem in different words.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;How to compound:&lt;/strong&gt; After each session, add a note about recurring issue types so Codex can categorize faster next time. Build a persistent list of known issues so deduplication improves over time.&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;10. Pull requests for non-engineers&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; Anyone who needs to make a small, well-scoped change to a codebase—such as copy updates, configuration changes, or content edits—without deep engineering knowledge.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Input sources:&lt;/strong&gt; The relevant files or repository, and a clear description of the change.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Output artifact:&lt;/strong&gt; A pull request (PR) that is reviewer-friendly and doesn’t touch anything outside the intended scope.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;First prompt:&lt;/strong&gt;&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820628006-6q08r6"&gt;	I need to make the following change: [describe the change clearly].&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820628006-6q08r6"&gt;	Before making any changes:&lt;/p&gt;&lt;p 
data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820628006-6q08r6"&gt;1. Show me which files are affected&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820628006-6q08r6"&gt;2. Confirm the scope of the change—nothing outside these files should be touched&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820628006-6q08r6"&gt;	3. Explain what you are going to do in plain language before doing it&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820628006-6q08r6"&gt;	After making the change:&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820628006-6q08r6"&gt;	1. Summarize what was changed and why&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820628006-6q08r6"&gt;	2. List every file that was touched&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820628006-6q08r6"&gt;	3. Explain how you verified the change is correct&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820628006-6q08r6"&gt;	4. Flag anything a reviewer should look at carefully&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820628006-6q08r6"&gt;	Make the smallest useful change. Don’t refactor or improve anything adjacent.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Review step:&lt;/strong&gt; Review the Codex preview before the PR is opened. Review the PR itself in GitHub or your code review tool. 
Ask a technical colleague to approve before merging if you are uncertain.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;How to compound:&lt;/strong&gt; Save a template of your preferred PR format. After each PR, add a note about anything that required correction so future PRs avoid the same issue.&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;11. Recruiting research&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; Anyone doing outbound recruiting for a role with a specific background profile.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Input sources:&lt;/strong&gt; LinkedIn, Twitter/X, company websites, alumni databases, public professional networks.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Output artifact:&lt;/strong&gt; A list of candidates with background summaries and contact information or connection points.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;First prompt:&lt;/strong&gt;&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820638904-lqg7gg"&gt;	I am hiring for [role]. The ideal candidate has [background profile—experience, prior companies, skills, career trajectory].&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820638904-lqg7gg"&gt;	Search for candidates who match this profile. For each candidate:&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820638904-lqg7gg"&gt;	1. Summarize their background in two to three sentences&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820638904-lqg7gg"&gt;	2. Note why they match the profile&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820638904-lqg7gg"&gt;	3. 
Identify any connection point (mutual connections, follows, shared affiliations)&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820638904-lqg7gg"&gt;	4. Provide a link to their public profile&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820638904-lqg7gg"&gt;	Return the top [number] candidates, ranked by how closely they match the profile.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Review step:&lt;/strong&gt; Review each candidate before any outreach. Verify that the background summaries are accurate by checking the linked profiles. Don’t send any outreach through Codex.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;How to compound:&lt;/strong&gt; Save the role profile as a template. After a successful hire, document what the actual background looked like versus the initial profile to calibrate future searches.&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;12. Strategy and planning agent&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; Leaders and operators who need to compress OKR planning, quarterly planning, or strategic reviews from days to hours.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Input sources:&lt;/strong&gt; Past planning documents, meeting transcripts, leadership context notes, relevant metrics.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Output artifact:&lt;/strong&gt; A draft plan or OKR set, structured for review and iteration.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;First prompt:&lt;/strong&gt;&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820656796-4vc62h"&gt;	I need to draft [quarterly plan / OKR set / strategic review] for [scope].&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820656796-4vc62h"&gt;	Pull from:&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" 
data-guide-block-id="guide-block-1779820656796-4vc62h"&gt;	- Past plans: [location]&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820656796-4vc62h"&gt;	- Recent meeting transcripts: [location]&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820656796-4vc62h"&gt;	- Current metrics: [location or description]&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820656796-4vc62h"&gt;	- Leadership context: [document or description]&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820656796-4vc62h"&gt;	Structure the output as [desired format].&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820656796-4vc62h"&gt;Flag any goal or initiative you are recommending that doesn’t have explicit support in the source material. I want synthesis of what has already been decided, not new recommendations baked in without my review.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Review step:&lt;/strong&gt; Review in Notion or Proof. Before sharing with leadership or the team, confirm that every major commitment traces to a decision that was actually made.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;How to compound:&lt;/strong&gt; After each planning cycle, add a retrospective to your context file. Did the goals prove achievable? What was missing from the original plan? Future planning sessions will be informed by past ones.&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;13. 
Personal learning tool&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; Anyone who wants to use Codex to support skill-building, practice, or self-directed learning.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Input sources:&lt;/strong&gt; External APIs, files, structured practice materials, your own notes.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Output artifact:&lt;/strong&gt; A custom interactive tool—like a tutor, a quiz, or a practice environment—built for your learning goal.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Example: &lt;/strong&gt;A musician wants to practice chord identification. They connect a MIDI keyboard and describe what they want, and Codex builds a small app that listens to what they play, identifies the chord, and tracks progress over time. &lt;/p&gt;&lt;p&gt;&lt;strong&gt;First prompt:&lt;/strong&gt;&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820674573-yry42c"&gt;	I want to build a personal learning tool for [skill or subject]. &lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820674573-yry42c"&gt; 	My current level: [beginner/intermediate/what I know already].&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820674573-yry42c"&gt;	What I want to practice: [specific aspect of the skill].&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820674573-yry42c"&gt;	How I want feedback: [immediate/after each session/scored].&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820674573-yry42c"&gt;	Build a prototype I can use locally. Explain what it does and how to use it before I start.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Review step:&lt;/strong&gt; Try the tool on real practice material before committing to it. 
Verify it is actually testing what you intended.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;How to compound:&lt;/strong&gt; After each practice session, ask Codex to update the tool based on what you found most and least useful. The tool improves as your needs become clearer.&lt;/p&gt;&lt;h2&gt;&lt;strong&gt;Part 5: Operating Codex well&lt;/strong&gt;&lt;/h2&gt;&lt;h3&gt;&lt;strong&gt;How to Steer Codex&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;Operating Codex well is management work. You evaluate talent (which prompts, agents, and workflows to trust), set vision (what to point Codex at, and what “done” should look like), exercise taste (catching output that is technically correct but wrong for the moment), and know when to let be or take the wheel.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Give Codex an outcome.&lt;/strong&gt; Describe what you want to end up with, not how to get there. “Build a research brief on [topic] with these sources and this structure” produces better results than “First search Slack, then search Notion, then...”&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Ask for a plan before long-running work.&lt;/strong&gt; For any task that will take more than a few minutes or touch multiple systems, ask Codex to explain what it’s about to do before it starts. This catches misunderstandings early and gives you a chance to redirect it before it gets too far along.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Ask Codex what it needs before it starts.&lt;/strong&gt; For complex tasks, a short briefing prompt saves time:&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820795697-av5ui3"&gt;Before you start, tell me what additional context would help you do this better. What are the most important things you would want to know?&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Require citations and audit trails for important claims.&lt;/strong&gt; Any document that will be shared or used for decisions should have source links for factual claims. 
Make this a standing rule in your preferences file.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Don’t over-manage every micro-step once the plan is good.&lt;/strong&gt; Once you have confirmed the approach, let Codex work. Interrupting undermines autonomous operation and produces worse results than reviewing the completed output.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Review in the destination app.&lt;/strong&gt; Always.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Set explicit no-send/no-post/no-archive/no-modify rules in your rules file.&lt;/strong&gt; These should apply by default to any sensitive workflow. Make Codex ask before taking any action that can’t easily be undone.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Three questions to ask before approving any significant output:&lt;/strong&gt;&lt;/p&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820810918-pg97kr"&gt;What was the hardest decision you made in producing this?&lt;/p&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820810918-pg97kr"&gt;What alternatives did you consider and reject?&lt;/p&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820810918-pg97kr"&gt;Where are you least confident?&lt;/p&gt;&lt;p&gt;These questions surface the judgment calls the model made, the options it dismissed, and the places most likely to contain errors.&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;Safety, trust, and risks&lt;/strong&gt;&lt;/h3&gt;&lt;h4&gt;&lt;strong&gt;Risk categories&lt;/strong&gt;&lt;/h4&gt;&lt;p&gt;&lt;strong&gt;Green—proceed with standard review:&lt;/strong&gt; Summaries, outlines, internal drafts, research briefs, personal notes, low-stakes scripts.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Yellow—review carefully before sharing or acting:&lt;/strong&gt; Strategy documents, customer-support drafts, product specs, recruiting research, non-destructive data pulls, PR drafts for small changes.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Red—don’t proceed without explicit human 
verification:&lt;/strong&gt; Sending messages to clients or customers, changing source-of-truth data, making production code changes, moving money, legal or compliance claims, unreconciled metrics used for business decisions.&lt;/p&gt;&lt;h4 data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820870615-20fyx4"&gt;&lt;strong&gt;Common failure modes and how to handle them&lt;/strong&gt;&lt;/h4&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820870615-20fyx4"&gt;&lt;em&gt;Confident wrongness.&lt;/em&gt; Codex can state incorrect facts with high confidence. For any factual claim that matters, verify against the source. Never pass a statistic or claim to another person without checking it.&lt;/p&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820870615-20fyx4"&gt;&lt;em&gt;Metrics errors.&lt;/em&gt; Joining data from multiple sources introduces definition mismatches and calculation errors. Verify column by column for any metric used in decisions.&lt;/p&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820870615-20fyx4"&gt;&lt;em&gt;Out-of-scope changes.&lt;/em&gt; Codex sometimes modifies files or makes improvements adjacent to the task you assigned. Review the changes line by line (called a “diff”), not just the final output, especially for any task involving code.&lt;/p&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820870615-20fyx4"&gt;&lt;em&gt;Automations that break.&lt;/em&gt; Persistent workflows stop working when tools update their APIs, credentials expire, or context files become stale. Every automation needs an owner who checks it periodically. Sever that connection—stop tending it—and the agent stops being useful. 
“Set it and forget it” isn’t a stable operating mode.&lt;/p&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820870615-20fyx4"&gt;&lt;em&gt;Plugin and integration failure.&lt;/em&gt; Plugins and integrations need maintenance: Permissions expire, APIs change, configurations need updates, and some changes require restarting Codex. Integration failures—particularly with Notion and Gmail—happen and aren’t always obvious. If a workflow produces strange output, check whether the connection is still working before assuming the prompt is wrong.&lt;/p&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820870615-20fyx4"&gt;&lt;em&gt;Usage limits.&lt;/em&gt; Long-running sessions can hit usage limits and stop mid-task. For complex workflows, break work into stages rather than attempting everything in a single session.&lt;/p&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820870615-20fyx4"&gt;&lt;em&gt;Untrusted input.&lt;/em&gt; Anything Codex reads—an email, a web page, a shared document, a support ticket—can contain instructions aimed at the agent rather than at you, sometimes hidden from human eyes. If Codex is browsing untrusted sites or processing external messages while holding broad write access, those buried instructions can turn into actions—like sending data where it shouldn’t go. So keep destructive actions behind approval, and scope each workflow to the least access it needs, so a hijacked instruction has nowhere to go.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;The human ownership standard:&lt;/strong&gt; Codex can touch any artifact in your workflow, but a human must direct the work, stand behind the output, and be able to discuss any specific decision in it. If someone asks you about a bullet point in a document Codex drafted, you should be able to answer. 
An AI-drafted document is fine—expected, even—but if someone talks it through with you and it’s clear you have no idea what’s in it, that’s a problem.&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;Team workflows: From personal Codex to shared operating system&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;Individual Codex workflows compound over time. Team workflows compound faster but require coordination.&lt;/p&gt;&lt;h4&gt;&lt;strong&gt;What changes when a team uses Codex&lt;/strong&gt;&lt;/h4&gt;&lt;p&gt;Teams build trust in agents through the humans who operate them. When a colleague receives a document or plan that Codex drafted, they trust it to the degree they trust the person who shared it.&lt;/p&gt;&lt;h4&gt;&lt;strong&gt;Infrastructure that makes team Codex work&lt;/strong&gt;&lt;/h4&gt;&lt;p&gt;&lt;em&gt;Shared review surfaces.&lt;/em&gt; A shared document review tool (Proof, Notion, Google Docs) makes agent-generated documents easier to inspect and comment on than outputs reviewed only inside Codex.&lt;/p&gt;&lt;p&gt;&lt;em&gt;Codex-mediated routing. &lt;/em&gt;Teams can combine Codex threads, automations, Slack or GitHub integrations, remote connections, and app-server APIs to build routing workflows: Requests arrive in Slack, email, or another shared intake surface; Codex helps triage them, creates reviewable tasks or drafts, and routes the work to the right human or Codex workspace for execution. Each route needs clear ownership, permissions, review rules, and a source of truth. For teams doing a lot of cross-functional requests, such as legal reviews, data pulls, or copy approvals, this pattern removes significant coordination overhead.&lt;/p&gt;&lt;p&gt;A key mechanic to making this style of work possible is giving Codex its own email address. Codex doesn’t come with one—you set it up with a tool like &lt;u&gt;&lt;a href="https://www.nylas.com/" rel="noopener noreferrer" target="_blank"&gt;Nylas&lt;/a&gt;&lt;/u&gt; that gives an agent an inbox. 
Once it has that address, you can treat it like another teammate. Routes built on an email address still need the same discipline as any other: a clear owner, scoped permissions, and a review step before anything goes back out.&lt;/p&gt;&lt;p&gt;&lt;em&gt;Agent-readable shared documentation.&lt;/em&gt; Plans, strategy documents, and operational guides written for both human and agent readers become shared infrastructure. Any team member—or any team member’s agent—can query them for specific information without interrupting the author.&lt;/p&gt;&lt;p&gt;&lt;em&gt;Explicit ownership.&lt;/em&gt; Every persistent workflow needs a named owner. That person is responsible for monitoring output quality, updating the workflow when it breaks, and retiring it when it’s no longer useful. Automation degrades without ownership.&lt;/p&gt;&lt;h4 data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820923832-39cry6"&gt;A simple way to get a team to use Codex&lt;/h4&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820923832-39cry6"&gt;Don’t try to convert everyone. As a rule of thumb, a tenth of any team will adopt a new tool no matter what, a tenth never will, and the other 80 percent come along once someone shows them how it helps their own job. Aim at that 80 percent. 
Three things, done together, help adoption along:&lt;/p&gt;&lt;ol&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820923832-39cry6"&gt;A note from a leader that makes using AI the expectation, not a nice-to-have&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820923832-39cry6"&gt;A weekly meeting where anyone can show a prompt or workflow they’ve built&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820923832-39cry6"&gt;A regular message that names the people whose work stood out&lt;/li&gt;&lt;/ol&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820923832-39cry6"&gt;Set the expectation, give people a place to share what works, and recognize them for it—that’s most of the battle.&lt;/p&gt;&lt;h2&gt;&lt;strong&gt;Part 6: Getting started&lt;/strong&gt;&lt;/h2&gt;&lt;h3&gt;&lt;strong&gt;The seven-day Codex power-user plan&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Day 1: Connect and inspect.&lt;/strong&gt; Install the Codex desktop app. Connect your primary tools—Gmail, Slack, Notion, Drive, and any analytics or support tools you use. Run the workflow discovery prompt from Part 2 and review the three automation suggestions Codex returns. Don’t build anything yet. Just read the suggestions and identify which one is most useful.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Day 2: Create your context files.&lt;/strong&gt; Create your codex-workspace/ folder. Write context.md, preferences.md, and rules.md. Keep each one to one page. The goal is to capture the most important things Codex should know about you—not to be exhaustive.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Day 3: Run three one-off tasks.&lt;/strong&gt; Choose one summary task, one research brief, and one draft or plan. Use the prompt patterns from Level 1. 
Review each output carefully and note where Codex got things right and where it needed correction.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Day 4: Build your first workflow.&lt;/strong&gt; Take the most useful automation suggestion from Day 1 and fill out the workflow canvas from Level 3. Save it to workflows/ in your workspace. Run it once manually and verify the output.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Day 5: Add review rules.&lt;/strong&gt; Create reviews/data-checklist.md, reviews/writing-checklist.md, and reviews/comms-checklist.md. Start each one with five checks based on what you noticed during Days 3 and 4. These will grow over time.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Day 6: Turn one workflow into a reusable artifact.&lt;/strong&gt; Take the workflow from Day 4 and document the prompt, the output format, the review step, and any known edge cases. Save it as a complete workflow file. Run it again and verify the documentation is accurate.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Day 7: Compound.&lt;/strong&gt; Run the compounding prompt at the end of your Codex session:&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820956716-yzcjc0"&gt;Based on everything we have done this week, what should become a reusable skill,&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820956716-yzcjc0"&gt;an automation, or a small tool? 
What context should I add to my project files&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820956716-yzcjc0"&gt;so future sessions start from a better baseline?&lt;/p&gt;&lt;p&gt;Review Codex’s suggestions and implement the one that would save the most time over the next month.&lt;/p&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820994207-1bypew"&gt;&lt;strong&gt;30-day extension:&lt;/strong&gt;&lt;/p&gt;&lt;ul&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820994207-1bypew"&gt;Week 1: One personal workflow running reliably&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820994207-1bypew"&gt;Week 2: One multi-source workflow pulling from at least three connected tools&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820994207-1bypew"&gt;Week 3: One small tool or automation that handles a chore without your involvement&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820994207-1bypew"&gt;Week 4: One shared or team workflow with explicit ownership and review cadence&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;Start today. Connect the tools you’re comfortable permissioning and ask Codex what recurring workflows it can see from the available context. That question, and what you do with the answer, is the gateway to the Codex universe.&lt;/p&gt;&lt;p&gt;Codex is easy to underestimate. At first glance it looks like another AI coding tool; if you’re not an engineer, a natural conclusion is that it’s not for you.&lt;/p&gt;&lt;p&gt;That reading misses how much Codex makes possible. &lt;/p&gt;&lt;p&gt;Picture a Monday morning: A request for a launch plan lands in your inbox. You forward it to Codex, which has its own email account, and close your laptop while Codex runs tasks in the cloud, or on a machine like a Mac Mini that you keep active. 
On your commute to the office, you get an email notification on your phone: Codex has read the relevant Slack threads, pulled customer notes out of Google Drive, checked last quarter’s numbers in PostHog, and started a go-to-market plan in a shared Notion document. It just needs you to confirm one detail about timing, which you do with a thumbs-up. By the time you reach your desk, a draft is waiting for review. &lt;/p&gt;&lt;p&gt;This is a day in the life of an agent-pilled knowledge worker. It all runs on OpenAI’s agent, Codex, in the Codex desktop app. We use “Codex” to refer to the app throughout this guide. &lt;/p&gt;&lt;p&gt;Codex is a workspace for you and your AI agents. Give Codex access to the files, apps, and tools it needs, and it gathers context and moves through the task across every surface it can reach—including your connected apps, the browser, and your computer. That makes it useful not just for code, but for a broad range of knowledge work.&lt;/p&gt;&lt;p&gt;There are two ways to work with agents in Codex: &lt;strong&gt;Delegate&lt;/strong&gt; or &lt;strong&gt;collaborate.&lt;/strong&gt;&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;strong&gt;Delegate&lt;/strong&gt; tasks that are predictable, repeatable, and low-risk. With clear, well-specified instructions, the agent can execute autonomously and bring back finished work for your review.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Collaborate&lt;/strong&gt; on tasks that are judgment-heavy, exploratory, or iterative. You work alongside the model toward an outcome that matches your vision.&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;AI progress has reached a point where expertise is easy to replicate. Each new model can do more of what used to require rare skill—which creates both more opportunity and more noise. The people who work best in this environment know how to direct AI’s capability without losing their personal judgment. 
They &lt;u&gt;&lt;a href="https://every.to/p/after-automation" rel="noopener noreferrer" target="_blank"&gt;ride the models&lt;/a&gt;&lt;/u&gt; rather than being overwhelmed by them.&lt;/p&gt;&lt;p&gt;Expert Codex users are one of the clearest examples of what that looks like in practice.&lt;/p&gt;&lt;p&gt;This guide is about becoming one of those people. It covers how to set up a workspace, run high-leverage knowledge-work tasks, and turn repeated work into durable systems that get better over time. If you’re ready to think of your work in terms of systems instead of one-off tasks, this guide is for you.&lt;/p&gt;&lt;p&gt;&lt;br&gt;&lt;/p&gt;&lt;p data-guide-block-kind="agent-buttons" data-guide-block-id="guide-block-1779827761591-u9k6gl"&gt;&lt;br&gt;&lt;/p&gt;&lt;h2&gt;&lt;strong&gt;Part 1: Understanding Codex&lt;/strong&gt;&lt;/h2&gt;&lt;h3&gt;What Codex is&lt;/h3&gt;&lt;p&gt;Codex is a tool-using agentic workspace: You give it a goal and it plans the work, uses available tools and context, and produces a result for you to review. 
It can read and write files on your computer, connect to external services through plugins and other integrations, run multi-step tasks without asking for guidance, generate code and scripts when a task needs them, and maintain context across a persistent workspace.&lt;/p&gt;&lt;p&gt;Specific capabilities that make Codex worth using: &lt;/p&gt;&lt;ul&gt;&lt;li&gt;Works alongside you on multiple tasks in parallel&lt;/li&gt;&lt;li&gt;Pulls context from the apps and files you connect&lt;/li&gt;&lt;li&gt;Uses a supported browser and desktop workflows when a task needs on-screen action&lt;/li&gt;&lt;li&gt;Checks its own work, revises, and keeps going&lt;/li&gt;&lt;li&gt;Holds a persistent goal across a long-running session, instead of treating each message as a one-off request&lt;/li&gt;&lt;li&gt;Turns repeatable tasks into recurring workflows&lt;/li&gt;&lt;li&gt;Helps route shared requests from places like Slack, email, or forms&lt;/li&gt;&lt;li&gt;Lets you start, steer, approve, and review work from your phone while Codex works in the cloud or on a machine, such as a Mac Mini, that you keep awake&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;These capabilities make Codex useful both for delegating well-specified tasks and as a shared workspace for human-agent collaboration. Deciding which mode fits which needs is the meta-skill of modern knowledge work.&lt;strong&gt; &lt;/strong&gt;&lt;/p&gt;&lt;h4 data-guide-block-kind="callout" data-guide-block-id="guide-block-1779817335666-b66wia"&gt;&lt;strong&gt;A note on Goals&lt;/strong&gt;&lt;/h4&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1779817335666-b66wia"&gt;A Goal in Codex, initiated using the &lt;code&gt;/goal&lt;/code&gt; command, is a persistent objective that shapes an entire session rather than living and dying with a single message. Instead of re-briefing the agent on every turn, you tell it what “done” looks like, how success gets checked, and which constraints to respect. 
Codex then keeps working toward that outcome across interruptions and session breaks. Goals let you delegate long-horizon work, collaborate without losing the thread, and compound progress over time instead of restarting from scratch. &lt;/p&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1779817335666-b66wia"&gt;A simple test for when to use &lt;code&gt;/goal&lt;/code&gt;: If you’d type the same sentence into three prompts in a row—“cite every factual claim, match the house style, never send without my review”—make it a goal instead.&lt;/p&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1779817335666-b66wia"&gt;&lt;strong&gt;Goals versus skills. &lt;/strong&gt;A skill is a reusable set of packaged instructions (sometimes with scripts) that teaches Codex how to handle a recurring kind of task well. A goal, on the other hand, is what you’re trying to accomplish in a given stretch of work. It guides one session until the objective is met, then it’s done.&lt;/p&gt;&lt;h4&gt;&lt;strong&gt;Codex on mobile&lt;/strong&gt;&lt;/h4&gt;&lt;p&gt;Codex also runs from your phone through the ChatGPT mobile app, remotely controlling the machine where your work is happening. The mobile app suits the lightweight parts of a workflow: You can kick off a task, answer a question, approve an action, or review a draft from anywhere. Heavier review still deserves a real screen.&lt;/p&gt;&lt;h3&gt;What Codex isn’t&lt;/h3&gt;&lt;p&gt;Codex isn’t a magic intern that can safely act without supervision. It isn’t a replacement for taste, judgment, or ownership. It isn’t a replacement for human review or fact-checking. 
It isn’t useful for tasks where the source data is inaccessible, the criteria for success are entirely subjective, or the stakes of an error are too high to allow autonomous action.&lt;/p&gt;&lt;h4&gt;&lt;strong&gt;Useful rules&lt;/strong&gt;&lt;/h4&gt;&lt;p&gt;A task is a good candidate for Codex if it has at least two of the following traits:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;It requires pulling data from multiple sources.&lt;/li&gt;&lt;li&gt;It involves repeated steps you do regularly.&lt;/li&gt;&lt;li&gt;It can be checked against objective criteria.&lt;/li&gt;&lt;li&gt;It produces a durable artifact—a document, a plan, a report, a script.&lt;/li&gt;&lt;li&gt;It benefits from synthesis across many inputs.&lt;/li&gt;&lt;li&gt;It’s annoying enough that you routinely delay or avoid it.&lt;/li&gt;&lt;/ul&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1779817287018-p1yi0r"&gt;&lt;strong&gt;Delegate &lt;/strong&gt;tasks&lt;strong&gt; &lt;/strong&gt;when they are: &lt;/p&gt;&lt;ul&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779817287018-p1yi0r"&gt;Repeatable&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779817287018-p1yi0r"&gt;Objective&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779817287018-p1yi0r"&gt;Checkable&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779817287018-p1yi0r"&gt;Low-risk&lt;/li&gt;&lt;/ul&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1779817287018-p1yi0r"&gt;&lt;strong&gt;Collaborate &lt;/strong&gt;on tasks that are: &lt;/p&gt;&lt;ul&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779817287018-p1yi0r"&gt;Ambiguous&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779817287018-p1yi0r"&gt;Judgment-heavy&lt;/li&gt;&lt;li data-guide-block-kind="callout" 
data-guide-block-id="guide-block-1779817287018-p1yi0r"&gt;Exploratory&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779817287018-p1yi0r"&gt;Iterative&lt;/li&gt;&lt;/ul&gt;&lt;h4&gt;&lt;strong&gt;Codex, Claude Code, and Claude Cowork&lt;/strong&gt;&lt;/h4&gt;&lt;p&gt;If you’ve used Claude Code, you already have a mental model for an agent that works on your machine. For broader knowledge work, OpenAI and Anthropic have arrived at a similar experience from different directions. &lt;/p&gt;&lt;p&gt;Anthropic packages everything into one Claude app with three modes: Chat, Code, and Cowork. Code began as a terminal tool for developers (Claude Code) and now has a graphical version inside the app—no terminal required. It’s built for code repositories, but with the right connectors it handles a lot of general knowledge work too. Cowork takes the same engine and aims it at non-coding work, with folder access, Chrome browsing, computer use, scheduled tasks, and persistent project memory.&lt;/p&gt;&lt;p&gt;Codex is OpenAI’s counterpart, but rather than split the work across modes, it puts coding and knowledge work in a single workspace. A few things give Codex an edge for knowledge work today:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;strong&gt;One surface, not two.&lt;/strong&gt; Anthropic splits agentic work between Code and Cowork; Codex handles both in the same place, so you’re never deciding which mode a task belongs in.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;A browser that works beside you.&lt;/strong&gt; Codex renders the pages inside the app itself as a shared view between you and the agent. The Claude app operates a stand-alone Chrome window or your full screen instead. For logged-in sites, both rely on a Chrome extension. In our experience, Codex’s built-in browser tends to be faster, more reliable, and more useful for collaborative work.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Connectors out of the box. 
&lt;/strong&gt;Codex comes with a catalog of connectors you authorize in a click; in the Claude app you add tools as MCP servers, which requires a bit more assembly. &lt;/li&gt;&lt;/ul&gt;&lt;p&gt;Which surface is right comes down to model preference and workflow habits; Codex has the edge for us today—but the labs ship fast, and that can change.&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;The Codex knowledge work loop&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;Every sustainable Codex workflow follows the same five-step pattern:&lt;/p&gt;&lt;h3 data-guide-block-kind="callout" data-guide-block-id="guide-block-1779817436733-2mrhhk"&gt;Connect → Contextualize → Delegate/collaborate → Review → Compound&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Connect: &lt;/strong&gt;Give Codex access to the systems you use for work—Gmail, Slack, Notion, Google Drive, your calendar, your analytics tools, your support platform, and/or local files. Without connected apps or source access, Codex is limited to the local/project files it can access, uploaded or linked materials, and context you provide in the thread. With connections, it can find what it needs on its own.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Contextualize:&lt;/strong&gt; Put your goals, preferences, project details, source links, review standards, and standing rules in files Codex can access, then cite those files in Codex’s &lt;u&gt;&lt;a href="http://agents.md" rel="noopener noreferrer" target="_blank"&gt;AGENTS.md&lt;/a&gt;&lt;/u&gt; file to make them readily available. This is the difference between an agent that has to be re-briefed every time and one that already understands who you are, what you’re working on, and how you like to work.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Delegate/collaborate:&lt;/strong&gt; Decide whether the task needs close collaboration or can run on its own. 
Either way, specify inputs, output format, and acceptance criteria, then let it work.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Review:&lt;/strong&gt; Check the output in the destination app. If Codex drafted Slack messages, review them in Slack. If it wrote a strategy document, review it in your word processor of choice, such as Google Docs, Notion, or &lt;strong&gt;&lt;u&gt;&lt;a href="https://proofeditor.ai/" rel="noopener noreferrer" target="_blank"&gt;Proof&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;. Content that looks fine in a terminal or the Codex app may read differently in the space where it will ultimately be used.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Compound:&lt;/strong&gt; Turn what works into something reusable. Save the prompt. Document the workflow. Add mistakes to your review checklist and keep your context files up to date. Each session should make future sessions faster.&lt;/p&gt;&lt;h2&gt;&lt;strong&gt;Part 2: Setup&lt;/strong&gt;&lt;/h2&gt;&lt;h3&gt;&lt;strong&gt;Connect your systems&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;Connect the tools you want Codex to have access to. This includes Gmail, Slack, Notion, Google Drive, your calendar, analytics tools, support platforms, or anything else for which Codex has an integration. Once the relevant tools are connected, Codex can look at your actual work context and suggest workflows based on your messages, files, meetings, and recurring tasks.&lt;/p&gt;&lt;p&gt;Connecting a tool isn’t the same thing as letting Codex act on it. Across everything you connect, Codex can read and draft while still asking for your approval before it sends, posts, archives, or deletes. That makes broad access low-risk early on: Connect generously so Codex can find workflows worth building. 
Then, once you know which ones you’ll keep, disconnect the tools you don’t need to reduce risk and limit unnecessary data exposure.&lt;/p&gt;&lt;h4 data-guide-block-kind="callout" data-guide-block-id="guide-block-1779818908797-042bpz"&gt;&lt;strong&gt;Three ways Codex reaches your tools&lt;/strong&gt;&lt;/h4&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1779818908797-042bpz"&gt;Codex can touch the same tool in more than one way, and knowing which access path is which saves a lot of confusion:&lt;/p&gt;&lt;ul&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779818908797-042bpz"&gt;&lt;strong&gt;Connectors (plugins)&lt;/strong&gt; give Codex structured, API-level access to an app—Gmail, Slack, Notion, your analytics tools. This is the most reliable and repeatable option, so use it whenever a connector exists.&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779818908797-042bpz"&gt;&lt;strong&gt;Browser use&lt;/strong&gt; lets Codex operate a web page directly through its in-app browser—useful for local previews, public pages, and anything you want to watch it do on a shared screen. 
For sites that require you to be signed in, like your email client, the Codex Chrome extension works inside your logged-in browser.&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779818908797-042bpz"&gt;&lt;strong&gt;Computer use&lt;/strong&gt; lets Codex see and operate your desktop the way a person would—clicking through an app, changing a setting, or working with software that only exists as a graphical interface.&lt;/li&gt;&lt;/ul&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1779818908797-042bpz"&gt;The rule of thumb: Reach for a connector first, the browser next, and computer use when nothing else can get to the task.&lt;/p&gt;&lt;p&gt;Starting prompt—use this once your integrations are set up:&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779817720669-ddieaq"&gt;Connect to the tools I use for work: [List your tools—Gmail, Slack, Notion, Drive, etc.]. Then look at my work patterns across those tools and suggest three workflows I should set up first. For each one, describe the input sources, the output artifact, how often it should run, what approval looks like, and what would make the workflow worth keeping long-term.&lt;/p&gt;&lt;p&gt;Once the relevant tools are connected and permissioned, this prompt lets Codex inspect the available work context and suggest automation candidates rather than forcing you to invent them.&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;Build your Codex workspace&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;Build Codex’s workspace before running any workflows. Skip this step and you’ll likely stall. &lt;/p&gt;&lt;p&gt;A Codex workspace is a folder—local on your machine, synced to GitHub if you want version control—that contains the context files, workflow instructions, and review standards Codex reads at the start of each session. 
Think of it as an onboarding manual the agent reads at the start of each session.&lt;/p&gt;&lt;h4&gt;&lt;strong&gt;An example workspace structure&lt;/strong&gt;&lt;/h4&gt;&lt;p data-guide-block-kind="terminal" data-guide-block-id="guide-block-1779817905236-7c2762"&gt;your-workspace/&lt;/p&gt;&lt;p data-guide-block-id="guide-block-1779817905236-7c2762" data-guide-block-kind="terminal"&gt;├── README.md                  # Start here—orientation&lt;/p&gt;&lt;p data-guide-block-kind="terminal" data-guide-block-id="guide-block-1779817905236-7c2762"&gt;├── identity/                  # About you&lt;/p&gt;&lt;p data-guide-block-kind="terminal" data-guide-block-id="guide-block-1779817905236-7c2762"&gt;│   ├── context.md&lt;/p&gt;&lt;p data-guide-block-kind="terminal" data-guide-block-id="guide-block-1779817905236-7c2762"&gt;│   ├── preferences.md&lt;/p&gt;&lt;p data-guide-block-kind="terminal" data-guide-block-id="guide-block-1779817905236-7c2762"&gt;│   └── rules.md&lt;/p&gt;&lt;p data-guide-block-kind="terminal" data-guide-block-id="guide-block-1779817905236-7c2762"&gt;├── playbooks/                 # Process—repeatable workflows&lt;/p&gt;&lt;p data-guide-block-kind="terminal" data-guide-block-id="guide-block-1779817905236-7c2762"&gt;│   ├── workflows/&lt;/p&gt;&lt;p data-guide-block-kind="terminal" data-guide-block-id="guide-block-1779817905236-7c2762"&gt;│   ├── inbox-sweep.md&lt;/p&gt;&lt;p data-guide-block-kind="terminal" data-guide-block-id="guide-block-1779817905236-7c2762"&gt;│   └── research-brief.md&lt;/p&gt;&lt;p data-guide-block-kind="terminal" data-guide-block-id="guide-block-1779817905236-7c2762"&gt;├── sources/                   # Source shelf—inputs&lt;/p&gt;&lt;p data-guide-block-kind="terminal" data-guide-block-id="guide-block-1779817905236-7c2762"&gt;│   ├── sources/&lt;/p&gt;&lt;p data-guide-block-kind="terminal" data-guide-block-id="guide-block-1779817905236-7c2762"&gt;│   ├── key-links.md&lt;/p&gt;&lt;p data-guide-block-kind="terminal" 
data-guide-block-id="guide-block-1779817905236-7c2762"&gt;│   └── recurring-docs.md&lt;/p&gt;&lt;p data-guide-block-kind="terminal" data-guide-block-id="guide-block-1779817905236-7c2762"&gt;├── outputs/                   # Finished work&lt;/p&gt;&lt;p data-guide-block-kind="terminal" data-guide-block-id="guide-block-1779817905236-7c2762"&gt;│   ├── outputs/&lt;/p&gt;&lt;p data-guide-block-kind="terminal" data-guide-block-id="guide-block-1779817905236-7c2762"&gt;│   ├── drafts/&lt;/p&gt;&lt;p data-guide-block-kind="terminal" data-guide-block-id="guide-block-1779817905236-7c2762"&gt;│   └── reports/&lt;/p&gt;&lt;p data-guide-block-kind="terminal" data-guide-block-id="guide-block-1779817905236-7c2762"&gt;└── reviews/                   # Quality checks—guardrails&lt;/p&gt;&lt;p data-guide-block-kind="terminal" data-guide-block-id="guide-block-1779817905236-7c2762"&gt;    ├── data-checklist.md&lt;/p&gt;&lt;p data-guide-block-kind="terminal" data-guide-block-id="guide-block-1779817905236-7c2762"&gt;    └── writing-checklist.md&lt;/p&gt;&lt;p&gt;What you’re doing here has a name: context engineering—a term popularized by Shopify CEO &lt;strong&gt;&lt;u&gt;&lt;a href="https://x.com/tobi/status/1935533422589399127" rel="noopener noreferrer" target="_blank"&gt;Tobi Lütke&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; and prominent AI engineer &lt;strong&gt;&lt;u&gt;&lt;a href="https://x.com/karpathy/status/1937902205765607626?lang=en" rel="noopener noreferrer" target="_blank"&gt;Andrej Karpathy&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;. Getting the right context to the model at the right time accounts for at least half of its performance. &lt;/p&gt;&lt;p&gt;At the start of each session, Codex looks at &lt;code&gt;AGENTS.md&lt;/code&gt;, which works as the table of contents. 
You can write your standing instructions directly in it, but we recommend keeping &lt;code&gt;AGENTS.md&lt;/code&gt; short and pointing it at more detailed files: &lt;code&gt;context.md&lt;/code&gt; for who you are and what you’re working on, &lt;code&gt;preferences.md&lt;/code&gt; for how you want the work done, and &lt;code&gt;rules.md&lt;/code&gt; for what it may and may not do without asking.&lt;/p&gt;&lt;h4 data-guide-block-kind="callout" data-guide-block-id="guide-block-1779818686951-lpvg9i"&gt;&lt;strong&gt;What to put in your context files&lt;/strong&gt;&lt;/h4&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1779818686951-lpvg9i"&gt;&lt;code&gt;context.md&lt;/code&gt; should cover:&lt;/p&gt;&lt;ul&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779818686951-lpvg9i"&gt;Your role and the function you own&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779818686951-lpvg9i"&gt;Active projects and their current status&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779818686951-lpvg9i"&gt;The tools you use daily and what each one is for&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779818686951-lpvg9i"&gt;The people or teams you work with most closely&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779818686951-lpvg9i"&gt;How decisions typically get made in your context&lt;/li&gt;&lt;/ul&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1779818686951-lpvg9i"&gt;&lt;code&gt;preferences.md&lt;/code&gt; should cover:&lt;/p&gt;&lt;ul&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779818686951-lpvg9i"&gt;Writing style and tone (formal or conversational, terse or thorough)&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779818686951-lpvg9i"&gt;Communication preferences (what you like to 
review before it goes out and what can be drafted and queued without your involvement)&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779818686951-lpvg9i"&gt;Decision-making preferences (when to ask before acting and when to proceed and report back)&lt;/li&gt;&lt;/ul&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1779818686951-lpvg9i"&gt;&lt;code&gt;rules.md&lt;/code&gt; should cover:&lt;/p&gt;&lt;ul&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779818686951-lpvg9i"&gt;What Codex may never do without explicit approval: Send, post, archive, delete, modify a source of truth, or move money&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779818686951-lpvg9i"&gt;What Codex may do without asking: Draft, summarize, research, outline, organize&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779818686951-lpvg9i"&gt;Any standing constraints specific to your work (e.g., client confidentiality rules, brand standards, data handling requirements)&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;Starting prompt—use this to have Codex create your workspace structure:&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779818637346-s57oa8"&gt;[First: Create a folder on your desktop called “Codex”] &lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779818637346-s57oa8"&gt;Set up this folder as a simple Codex workspace for knowledge work.&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779818637346-s57oa8"&gt;Create three starter files:&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779818637346-s57oa8"&gt;    1. 
context.md—who I am, what I’m working on, what tools I use, and who I work with&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779818637346-s57oa8"&gt;    2. preferences.md—how I like work to be written, reviewed, and handled&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779818637346-s57oa8"&gt;    3. rules.md—what you may do without asking, what you must ask before doing, and what you must never do&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779818637346-s57oa8"&gt;Interview me one question at a time to gather the information you need to fill in each file.&lt;/p&gt;&lt;h4&gt;&lt;strong&gt;The “one pinned chat per project” rule &lt;/strong&gt;&lt;/h4&gt;&lt;p&gt;The workspace folder is for your context; pinned chats are for your work. You can find the option to pin a chat next to the chat name in the app’s left-hand navigation bar. A useful habit from day one is to keep one persistent, pinned thread per project or area of responsibility—one for the product launch, one for weekly reporting, one for recruiting—rather than spinning up a fresh chat for every request. A standing thread accumulates context as you go, so Codex remembers what you have already established and you don’t have to re-explain the project each time. A pinned chat with a goal and the thread itself turns Codex into a reliable home for that stream of work.&lt;/p&gt;&lt;h2&gt;&lt;strong&gt;Part 3: The five levels of Codex use&lt;/strong&gt;&lt;/h2&gt;&lt;p&gt;Codex power users don’t arrive there all at once. They get there in stages, and each stage calls for a different way of thinking about what Codex is doing and what it’s good for. Skip ahead too quickly, and you’ll get frustrated—either you don’t trust it yet, or you haven’t built the infrastructure for more autonomous work. 
At every level, you should know when to hand work to Codex and when to stay in the loop as its collaborator.&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;Level 1: One-off knowledge work&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Mental model:&lt;/strong&gt; Codex as a capable, thorough research and drafting assistant.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Mode: &lt;/strong&gt;Collaborate.&lt;strong&gt; &lt;/strong&gt;At this level, nothing is automated. You run single-session tasks, review everything before it leaves your hands, and build familiarity with how Codex handles different types of work.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Best first tasks:&lt;/strong&gt;&lt;/p&gt;&lt;ul&gt;&lt;li&gt;Summarize a meeting transcript and extract decisions, open questions, and follow-up actions.&lt;/li&gt;&lt;li&gt;Turn scattered notes into a structured outline.&lt;/li&gt;&lt;li&gt;Build a research brief from a set of links and documents.&lt;/li&gt;&lt;li&gt;Rewrite a draft against a style guide.&lt;/li&gt;&lt;li&gt;Create a review checklist for a document, launch plan, or strategy memo.&lt;/li&gt;&lt;li&gt;Convert a written draft into an audio file for editing on the go.&lt;/li&gt;&lt;/ul&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779818837556-dazs6l"&gt;Use the attached [documents/links/notes] to produce [specific artifact]. Prioritize accuracy over elegance. Include source links for any factual claims. Flag anything uncertain or that requires my verification. End with the three questions I should answer before this artifact is ready to use.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Review habit:&lt;/strong&gt; Before polishing any output, ask Codex to list the assumptions it made and where it is least confident. 
This surfaces problems before you invest time in refinement.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Move to Level 2 when:&lt;/strong&gt; You keep wishing Codex remembered what you told it last time.&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;Level 2: Multi-source workflows&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Mental model:&lt;/strong&gt; Codex as a cross-system analyst that can assemble information you could never pull together manually in a reasonable amount of time.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Mode: &lt;/strong&gt;Collaborate. At this level, Codex can synthesize outputs from multiple connected systems—Slack threads, Notion pages, email archives, analytics dashboards, and Google Drive documents—but it still needs guidance and feedback.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Example multi-source tasks:&lt;/strong&gt;&lt;/p&gt;&lt;ul&gt;&lt;li&gt;A go-to-market plan built from internal meeting transcripts, Slack discussions, customer notes, and a strategy template&lt;/li&gt;&lt;li&gt;A weekly KPI report from analytics, revenue data, support volume, and social metrics&lt;/li&gt;&lt;li&gt;A summary synthesized from Slack, Notion, Drive links, and past drafts&lt;/li&gt;&lt;li&gt;A weekly leadership brief assembled from team standups, metrics, and open decisions&lt;/li&gt;&lt;/ul&gt;&lt;p data-guide-block-label="How to delegate a multi-source task" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779819023849-bi5kjp"&gt;I need [specific artifact].&lt;/p&gt;&lt;p data-guide-block-label="How to delegate a multi-source task" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779819023849-bi5kjp"&gt;Sources to use:&lt;/p&gt;&lt;p data-guide-block-label="How to delegate a multi-source task" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779819023849-bi5kjp"&gt; - [Tool 1]: [what to look for there]&lt;/p&gt;&lt;p data-guide-block-label="How to delegate a multi-source task" data-guide-block-kind="prompt" 
data-guide-block-id="guide-block-1779819023849-bi5kjp"&gt; - [Tool 2]: [what to look for there]&lt;/p&gt;&lt;p data-guide-block-label="How to delegate a multi-source task" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779819023849-bi5kjp"&gt; - [Tool 3]: [what to look for there]&lt;/p&gt;&lt;p data-guide-block-label="How to delegate a multi-source task" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779819023849-bi5kjp"&gt;Output format: [describe the structure you want]&lt;/p&gt;&lt;p data-guide-block-label="How to delegate a multi-source task" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779819023849-bi5kjp"&gt;Before you start, give me a short plan: Identify the sources you will inspect, the artifact you will produce, any gaps or unknowns you anticipate, and the checks you will run before calling it done. If anything requires sending, posting, archiving, or modifying a source of truth, ask first.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;A warning about data:&lt;/strong&gt; A one-shot attempt at pulling data from multiple systems can be wrong because of stale data, mismatched definitions, permissions gaps, or join errors. For any metric that informs business decisions or agent actions, verify column by column against your primary source. The closer a number is to a source of truth, the more carefully it needs to be checked.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Make your outputs agent-readable:&lt;/strong&gt; Plans and reports you generate in Codex will be read by other people—but also, increasingly, by their agents. Write them in plain, structured language that a human can scan and an agent can query. 
Clear section headers, explicit decisions, and labeled action items make the artifact useful in both directions.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Move to Level 3 when:&lt;/strong&gt; You keep running the same multi-source workflow more than once a week and wishing it happened automatically.&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;Level 3: Repeated chores into persistent workflows &lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Mental model:&lt;/strong&gt; Codex as an automated operations layer that handles predictable, recurring work so you don’t have to.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Mode: &lt;/strong&gt;Hybrid. Some tasks are fully predictable and can run without back-and-forth. These tasks are ripe for &lt;strong&gt;delegation&lt;/strong&gt;. Tasks that involve judgment, strategy, or creative decisions suit &lt;strong&gt;collaboration.&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;A useful heuristic: If you could write a checklist that covers 90 percent of the cases, delegate it. If you would need to think about it differently each time, collaborate.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;In either case, look for “computer chores”&lt;/strong&gt;—recurring tasks that take time and attention, but don’t require human judgment at every single touchpoint.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Common chore candidates:&lt;/strong&gt;&lt;/p&gt;&lt;ul&gt;&lt;li&gt;End-of-day check for unanswered Slack messages and emails, with drafted replies&lt;/li&gt;&lt;li&gt;Weekly metrics brief from analytics, revenue, and support data&lt;/li&gt;&lt;li&gt;Meeting-note cleanup and action-item extraction after each recorded call&lt;/li&gt;&lt;li&gt;Customer support pattern detection and issue routing&lt;/li&gt;&lt;li&gt;Draft-to-review package that formats a piece for editor handoff&lt;/li&gt;&lt;li&gt;Recruiting research for an open role&lt;/li&gt;&lt;/ul&gt;&lt;p data-guide-block-label="Workflow template" data-guide-block-kind="template" data-guide-block-id="guide-block-1779819137935-8no8us"&gt;Before building 
any persistent workflow, fill out this template. It becomes the instruction file Codex reads every time the workflow runs. (The workflows in Part 4 are each an example of this template applied.)&lt;/p&gt;&lt;p data-guide-block-label="Workflow template" data-guide-block-kind="template" data-guide-block-id="guide-block-1779819137935-8no8us"&gt;Workflow name:&lt;/p&gt;&lt;p data-guide-block-label="Workflow template" data-guide-block-kind="template" data-guide-block-id="guide-block-1779819137935-8no8us"&gt;Trigger or cadence:&lt;/p&gt;&lt;p data-guide-block-label="Workflow template" data-guide-block-kind="template" data-guide-block-id="guide-block-1779819137935-8no8us"&gt;Input sources:&lt;/p&gt;&lt;p data-guide-block-label="Workflow template" data-guide-block-kind="template" data-guide-block-id="guide-block-1779819137935-8no8us"&gt;Output artifact:&lt;/p&gt;&lt;p data-guide-block-label="Workflow template" data-guide-block-kind="template" data-guide-block-id="guide-block-1779819137935-8no8us"&gt;Approval rules:&lt;/p&gt;&lt;p data-guide-block-label="Workflow template" data-guide-block-kind="template" data-guide-block-id="guide-block-1779819137935-8no8us"&gt;What Codex may do without asking:&lt;/p&gt;&lt;p data-guide-block-label="Workflow template" data-guide-block-kind="template" data-guide-block-id="guide-block-1779819137935-8no8us"&gt;What Codex must ask before doing:&lt;/p&gt;&lt;p data-guide-block-label="Workflow template" data-guide-block-kind="template" data-guide-block-id="guide-block-1779819137935-8no8us"&gt;Verification steps:&lt;/p&gt;&lt;p data-guide-block-label="Workflow template" data-guide-block-kind="template" data-guide-block-id="guide-block-1779819137935-8no8us"&gt;Where the final output lives:&lt;/p&gt;&lt;p data-guide-block-label="Workflow template" data-guide-block-kind="template" data-guide-block-id="guide-block-1779819137935-8no8us"&gt;When to retire or revise this workflow:&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Review discipline for automated 
workflows:&lt;/strong&gt; Don’t review automated output inside Codex. Draft in Codex, then review in the destination app—Slack for Slack messages, Gmail for email drafts, word processors for documents. Content that looks fine in a terminal often reads differently in the tool where it’s ultimately used, and the context switch catches things a Codex review pass would miss.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Move to Level 4 when:&lt;/strong&gt; Your prompt-based workflow hits a ceiling—the task is too complex or too custom to handle in text alone, and a small script or local tool would make it reliable.&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;Level 4: Build small tools when prompts are not enough&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Mental model:&lt;/strong&gt; Codex as a builder that creates lightweight infrastructure to make your workflows more reliable, faster, or more repeatable.&lt;/p&gt;&lt;p&gt;Sometimes the best Codex output is a small script, a local app, a custom dashboard, or a review surface that makes a recurring workflow easier, rather than pure text.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Mode: &lt;/strong&gt;Hybrid. In some cases, Codex may generate an artifact independently for you to review and then move on. In others, the artifact it produces may become a space where you and the agent iterate together.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Examples of when a small tool helps:&lt;/strong&gt;&lt;/p&gt;&lt;ul&gt;&lt;li&gt;A recurring workflow that requires pulling from an API that has no Codex integration. A short script handles the connection reliably.&lt;/li&gt;&lt;li&gt;A review process where you need to see formatted output side by side with the source. A simple local app gives you the view.&lt;/li&gt;&lt;li&gt;A task that needs to run on a schedule without your involvement. A script set to run on a timer (a cron job) handles the timing.&lt;/li&gt;&lt;li&gt;A workflow that accumulates structured data over time. 
A lightweight database or structured file tracks it persistently.&lt;/li&gt;&lt;/ul&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820329598-tdqexj"&gt;&lt;strong&gt;Practical approach for non-engineers:&lt;/strong&gt;&lt;/p&gt;&lt;ol&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820329598-tdqexj"&gt;Run the task manually in Codex once to confirm the output is what you want&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820329598-tdqexj"&gt;Ask Codex: “Which steps in this workflow could be made more reliable with a small script or tool?”&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820329598-tdqexj"&gt;Have Codex prototype the tool and explain what it does in plain language&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820329598-tdqexj"&gt;Run it on your data and verify the output matches what the manual process produced&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820329598-tdqexj"&gt;Keep only the parts that reduce friction. Discard what adds complexity without benefit.&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;You don’t need to understand every line of code to use a tool Codex built. You do need to understand what data it touches, what it produces, and where the review step is. 
If you can’t explain those three things, the tool isn’t ready to run autonomously.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Move to Level 5 when: &lt;/strong&gt;You give Codex the same feedback repeatedly and have standing preferences that you’d like it to apply on its own.&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;Level 5: Compound your Codex system&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Mental model:&lt;/strong&gt; Codex as a system that can improve over time when you save useful workflows, maintain review rules, and use memories or skills to codify preferences where available.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Mode: &lt;/strong&gt;Hybrid. Some instructions will dictate how the agent approaches autonomous work; others will guide how the model interacts with you in collaboration mode.&lt;/p&gt;&lt;p&gt;The idea of “compounding” work comes from &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/guides/compound-engineering" rel="noopener noreferrer" target="_blank"&gt;compound engineering&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;, the AI-native coding methodology coined by &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@kieran_1355" rel="noopener noreferrer" target="_blank"&gt;Kieran Klaassen&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; and &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@nityesh" rel="noopener noreferrer" target="_blank"&gt;Nityesh Agarwal&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; while building &lt;strong&gt;&lt;u&gt;&lt;a href="http://cora.computer" rel="noopener noreferrer" target="_blank"&gt;Cora&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;, Every’s email client. The canonical example is a product requirements document (PRD) that writes the scaffolding for the next one: The artifact you produce becomes the tool that speeds up the next round. The four habits below are how you put it into practice as a knowledge worker, not just an engineer.&lt;/p&gt;&lt;p&gt;Remember: &lt;strong&gt;Each useful session should make future sessions faster and more reliable. 
&lt;/strong&gt;In practice, that requires doing four things consistently after completing any significant piece of work:&lt;/p&gt;&lt;p&gt;&lt;strong&gt;1. Save successful prompts as workflow files.&lt;/strong&gt; When a prompt produces exactly the right output, document it. Write down the input sources, the exact prompt, the output format, and the review step. Save it in your workflows/ folder. The next time you need the same output, the agent will have that reference to work from.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;2. Add mistakes to review checklists.&lt;/strong&gt; When Codex gets something wrong—a number that was off, a tone that missed the mark, or an assumption it should not have made—add a specific check to your relevant review file, and instruct Codex to check its work against those guardrails.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;3. Update your context files after major projects.&lt;/strong&gt; When a project ends, update context.md to reflect what changed—new priorities, new tools, what worked, and what didn’t. Codex can use this when you point it to the file, turn it into a skill/workflow, or store the pattern in Codex memory where available.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;4. Ask Codex to identify compounding opportunities.&lt;/strong&gt; At the end of any session where you did something useful, run this prompt:&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820367058-c271lt"&gt;Based on what we just did, what parts of this workflow should become a reusable skill, an automation, or a small tool? 
What context should I add to my project files so we don’t have to re-establish this next time?&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Forking for your discipline:&lt;/strong&gt; The &lt;u&gt;&lt;a href="https://github.com/EveryInc/compound-engineering-plugin" rel="noopener noreferrer" target="_blank"&gt;compound engineering plugin&lt;/a&gt;&lt;/u&gt;, Every’s open-source system for structured agent workflows, installable in Codex with one command, works for knowledge work out of the box, but its review agents are optimized for coding needs like establishing frontend patterns and reviewing for code performance.&lt;/p&gt;&lt;p&gt;Knowledge workers can fork it into a version with reviewers tuned for strategic alignment, data accuracy, writing quality, and communication standards. A forked version, &lt;u&gt;&lt;a href="https://github.com/EveryInc/compound-knowledge-plugin" rel="noopener noreferrer" target="_blank"&gt;compound knowledge&lt;/a&gt;&lt;/u&gt;, is publicly available on Every’s GitHub, and is designed to be readable and editable by non-engineers.&lt;/p&gt;&lt;h2&gt;&lt;strong&gt;Part 4: Workflow library&lt;/strong&gt;&lt;/h2&gt;&lt;p&gt;These workflows are meant as inspiration to get you started. Adapt the inputs, outputs, and approval rules to your specific tools and standards.&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;1. 
Inbox zero review queue&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; Anyone whose email backlog is a recurring source of anxiety or dropped balls.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Input sources:&lt;/strong&gt; Gmail or your email client of choice.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Output artifact:&lt;/strong&gt; A structured list of draft replies, proposed actions (archive, delegate, flag), and any emails flagged for your personal attention because the draft alone isn’t sufficient.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@danshipper" rel="noopener noreferrer" target="_blank"&gt;Dan Shipper&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; kept inbox zero for 10 days straight with Codex. To use this workflow, have Codex: &lt;/p&gt;&lt;ol&gt;&lt;li&gt;Gather email through &lt;u&gt;&lt;a href="https://cora.computer/" rel="noopener noreferrer" target="_blank"&gt;Cora&lt;/a&gt;&lt;/u&gt; running in the in-app browser.&lt;/li&gt;&lt;li&gt;Render the email queue as a single page.&lt;/li&gt;&lt;li&gt;Go through each item with you as you dictate the action the AI should take (e.g., “research this,” “draft that,” “pull the documents our lawyers asked for”). You can do this via chat or by voice with a dictation tool like &lt;strong&gt;&lt;u&gt;&lt;a href="https://www.monologue.to/" rel="noopener noreferrer" target="_blank"&gt;Monologue&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; (we recommend voice). 
&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;&lt;strong&gt;First prompt:&lt;/strong&gt;&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820476286-abtpfb"&gt;	Go through my inbox for the past [time period].&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820476286-abtpfb"&gt;	For each email that needs a response or action:&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820476286-abtpfb"&gt;1. Categorize it: needs reply/needs action/can archive/already handled&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820476286-abtpfb"&gt;2. If it needs a reply, draft one in my voice using the style in &lt;code&gt;preferences.md&lt;/code&gt;&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820476286-abtpfb"&gt;3. If it needs action, describe the action clearly&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820476286-abtpfb"&gt;4. Flag any email where a draft reply isn’t enough—where I need to think about this personally before responding&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820476286-abtpfb"&gt;	Don’t send anything. Create drafts only. I will review in Gmail.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Review step:&lt;/strong&gt; Review all drafts in Gmail before sending. Don’t approve from inside Codex.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;How to compound:&lt;/strong&gt; After a few sessions, add a rule file describing your categorization preferences—which senders always get priority, which topics can be archived without reply, and which types of requests need a human-written response. &lt;/p&gt;&lt;h3&gt;&lt;strong&gt;2. 
Daily unanswered message roundup&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; Anyone who communicates across Slack, email, and other channels and loses track of what still needs a response.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Input sources:&lt;/strong&gt; Slack, Gmail, any other communication tool you use.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Output artifact:&lt;/strong&gt; A list of unanswered items with drafted replies or proposed reactions, organized by urgency.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;First prompt:&lt;/strong&gt;&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820496080-1lm39w"&gt;Look across my Slack and Gmail for the past 24 hours. Find everything that was directed at me that I have not responded to.&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820496080-1lm39w"&gt;	For each item:&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820496080-1lm39w"&gt;1. Draft a reply or suggest a reaction (thumbs up, etc.) if a short acknowledgment is appropriate&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820496080-1lm39w"&gt;2. Flag items where a more considered response is needed&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820496080-1lm39w"&gt;3. Flag anything time-sensitive&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820496080-1lm39w"&gt;	Present the list organized by urgency. 
Don’t send anything.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Review step:&lt;/strong&gt; Review in Slack and Gmail.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;How to compound:&lt;/strong&gt; After a few runs, save a rules file specifying which Slack channels are high-priority, which senders always warrant a human response, and which types of messages can be handled with a reaction rather than a reply.&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;3. Research brief creation&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; Anyone preparing for a meeting, a pitch, a content piece, or a strategic decision and needing a thorough, sourced summary of a topic.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Input sources:&lt;/strong&gt; Provided links, Notion, Drive, web search.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Output artifact:&lt;/strong&gt; A structured brief with background, key facts, open questions, and source links.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;First prompt:&lt;/strong&gt;&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820511227-dhvhjc"&gt;	Build a research brief on [topic]. 
&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820511227-dhvhjc"&gt;	Sources to prioritize: [List any specific links, documents, or databases].&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820511227-dhvhjc"&gt;	Structure the brief as:&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820511227-dhvhjc"&gt;	- Background: what I need to know to have a smart conversation about this&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820511227-dhvhjc"&gt;	- Key facts and data points, each with a source link&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820511227-dhvhjc"&gt;	- Competing perspectives or significant disagreements in the field&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820511227-dhvhjc"&gt;	- Open questions I should be able to answer before [meeting/decision/deadline]&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820511227-dhvhjc"&gt;	- Three things I should read next if I want to go deeper&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820511227-dhvhjc"&gt;Flag any claims you are less than confident about.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Review step:&lt;/strong&gt; Check source links. Verify any statistics against the original source before using them.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;How to compound:&lt;/strong&gt; Save a brief template in your workflows/ folder. 
After each brief, add any recurring sources (newsletters, databases, key authors) to your sources/key-links.md so Codex checks them by default.&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;4. Writing with a parallel review loop&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; Writers who want Codex running alongside them as they draft—checking the work, flagging issues, and responding in parallel without interrupting the writing session.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Input sources:&lt;/strong&gt; Your draft (open in your word processor through Codex’s in-app browser), any relevant style guides, source documents, or review standards in your workspace.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Output artifact:&lt;/strong&gt; An annotated draft with inline feedback, flagged issues, and suggested revisions—produced continuously as you write rather than in a single pass at the end.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Setup: &lt;/strong&gt;Open your draft in Proof or the in-app browser. Start a Codex session with your workspace context loaded. Give Codex standing instructions for what to monitor and how to respond.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;First prompt:&lt;/strong&gt;&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820529028-05kfk1"&gt;	I am writing [describe the piece—type, audience, purpose].&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820529028-05kfk1"&gt;As I draft, run a continuous review loop. 
Check for:&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820529028-05kfk1"&gt;- Claims that need a source or are stated with more confidence than the evidence supports&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820529028-05kfk1"&gt;- Passages where the argument loses clarity or the logic has a gap&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820529028-05kfk1"&gt;- Sentences that violate the style preferences in &lt;code&gt;preferences.md&lt;/code&gt;&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820529028-05kfk1"&gt;- Anything that reads as filler, throat-clearing, or AI-generated phrasing&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820529028-05kfk1"&gt;Don’t rewrite anything without being asked. Flag issues as I go with a brief note on what the problem is and what would fix it. Check in every [X minutes / X paragraphs] or when I ask.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Review step:&lt;/strong&gt; Read the flagged issues at natural stopping points—the end of a section or session. Decide which to address and which to dismiss. Don’t let the feedback loop interrupt the drafting flow; the value is in the accumulation, not in responding to every flag in real time.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;How to compound:&lt;/strong&gt; After each writing session, add any recurring flags to your reviews/writing-checklist.md. Patterns that come up repeatedly are candidates for a standing rule in your preferences file, so Codex catches them automatically next time.&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;5. 
Source management for research&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; Writers and researchers who need to organize source material before drafting.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Input sources:&lt;/strong&gt; Links, PDFs, past drafts, notes, transcripts.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Output artifact:&lt;/strong&gt; A structured document with the core argument, supporting evidence organized by claim, counterarguments, and a gap analysis (what is still missing).&lt;/p&gt;&lt;p&gt;&lt;strong&gt;First prompt:&lt;/strong&gt;&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820559583-q3ha4f"&gt;	I am writing a piece on [topic]. The core argument I want to make is [argument].&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820559583-q3ha4f"&gt;	Here are my source materials: [links/documents].&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820559583-q3ha4f"&gt;	Build an evidence room that:&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820559583-q3ha4f"&gt;1. States the core argument clearly&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820559583-q3ha4f"&gt;	2. Lists the strongest supporting evidence for each main point, with source links&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820559583-q3ha4f"&gt;	3. Lists the strongest counterarguments and how I might address them&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820559583-q3ha4f"&gt;	4. 
Identifies any gaps—claims I am making that lack strong evidence&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820559583-q3ha4f"&gt;	5. Flags any sources that conflict with each other&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Review step:&lt;/strong&gt; Read the evidence room before drafting. Verify any statistics or quotes you plan to use directly.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;How to compound:&lt;/strong&gt; Save the evidence format as a workflow template. Add a standing note to your context file about your writing voice and recurring themes so Codex calibrates its framing.&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;6. Information via audio&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; Anyone who processes information better by listening than reading, or who wants to take time away from a screen but stay on top of work.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Input sources:&lt;/strong&gt; Any written content: drafts, research briefs, meeting summaries, strategy documents, reports, lengthy emails, articles.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Output artifact:&lt;/strong&gt; An audio file saved to a location accessible from your phone (Dropbox, Drive, etc.).&lt;/p&gt;&lt;p&gt;&lt;strong&gt;First prompt:&lt;/strong&gt;&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820573319-wgfep4"&gt;Convert the attached [document/draft/report] into a clear audio file. Read it at a natural pace—not rushed, not slow. Save it to [Dropbox/Drive location] as [filename].&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820573319-wgfep4"&gt;&lt;strong&gt;Review step:&lt;/strong&gt; Listen on your commute, walk, or wherever you have time away from a screen. Take notes on your phone as things come up. 
Return to the source material with whatever you noticed.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;How to compound:&lt;/strong&gt; Add a standing instruction to your context file covering your audio preferences—such as speed, file format, naming convention, and preferred save location—so you do not have to specify each time. You can also prompt Codex to convert content automatically at the end of certain workflows: “After generating the weekly metrics report, convert it to audio and save to [location].”&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;7. Go-to-market plan generator&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; Anyone responsible for launching a product, feature, or initiative and who has done the thinking in meetings and Slack but has not had time to formalize it.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Input sources:&lt;/strong&gt; Meeting transcripts, Slack threads, customer notes, a preferred strategy template.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Output artifact:&lt;/strong&gt; A complete go-to-market plan, structured for human review and agent querying.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;First prompt:&lt;/strong&gt;&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820595232-4q50ag"&gt;	Build a go-to-market plan for [product/initiative]. 
&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820595232-4q50ag"&gt;	Sources to pull from:&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820595232-4q50ag"&gt;	- Meeting transcripts: [Notion location or links]&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820595232-4q50ag"&gt;	- Slack discussions: [channels or search terms]&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820595232-4q50ag"&gt;	- Customer research: [document or location]&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820595232-4q50ag"&gt;	- Template to follow: [link or paste template]&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820595232-4q50ag"&gt;The plan should be readable by a human in five minutes and structured so that an agent can answer specific questions about it (e.g., “What is the target ICP?” “What is the launch timeline?”).&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820595232-4q50ag"&gt;Start with a compound engineering brainstorm step. Give me a draft in Proof or Notion. Flag anything in the plan you added that was not in the source material—I only want synthesis of what we have already decided, not new suggestions baked in.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Review step:&lt;/strong&gt; Review in Notion or Proof. Verify that every major claim traces to something in the source material. Anything the model added that was not in your sources should be flagged for your decision.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;How to compound:&lt;/strong&gt; Save the template and the prompt. 
After each launch, add a retrospective note to your context file about what the plan got right and wrong. Future plans will be calibrated by past ones.&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;8. KPI report&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; Anyone responsible for tracking metrics and needing a regular, reliable view across multiple data sources.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Input sources:&lt;/strong&gt; Analytics (PostHog, Mixpanel, Amplitude), revenue data (Stripe), support volume, social metrics, saved past reports.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Output artifact:&lt;/strong&gt; A one-page report covering headlines, usage metrics, system health, and follow-up items.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;First prompt:&lt;/strong&gt;&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820607666-pyt5wv"&gt;	Generate a product pulse report for [time period].&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820607666-pyt5wv"&gt;	Data sources:&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820607666-pyt5wv"&gt;	- Product analytics: [tool and what to pull]&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820607666-pyt5wv"&gt;	- Revenue: [tool and what to pull]&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820607666-pyt5wv"&gt;	- Support: [tool and what to pull]&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820607666-pyt5wv"&gt;	- Social: [tool and what to pull]&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820607666-pyt5wv"&gt;	Structure:&lt;/p&gt;&lt;p 
data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820607666-pyt5wv"&gt;	1. Headlines (three to five bullets summarizing what matters most)&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820607666-pyt5wv"&gt;2. Usage (primary engagement metric, value-realization metric, conversions, deltas vs. prior period)&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820607666-pyt5wv"&gt;3. System health (error rates, latency, top error signatures)&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820607666-pyt5wv"&gt;	4. Follow-ups (one to five things worth investigating, specific enough to act on)&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820607666-pyt5wv"&gt;Flag any number that differs significantly from the prior report. If something is anomalous, investigate one level deeper before including it.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Review step:&lt;/strong&gt; Verify every number in the report against its source. Don’t use this report as a business source of truth until you have confirmed accuracy column by column. In practice, one-shot metrics pulls are often five to 10 percent off—a common result of definition mismatches and join errors across multi-source pulls.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;How to compound:&lt;/strong&gt; Save each report as a dated file in your outputs/reports/ folder. Over time, Codex can compare reports, identify trends, and flag when something has changed. The folder becomes the working memory of your product.&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;9. 
Customer support for product work&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; Teams where support patterns should feed into product decisions and small fixes.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Input sources:&lt;/strong&gt; Support platform (Intercom, Zendesk), issue tracker (Linear, GitHub Issues).&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Output artifact:&lt;/strong&gt; A deduplicated list of issues with suggested priority, plus small issues ready to hand off for fixes.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;First prompt:&lt;/strong&gt;&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820617261-k37uky"&gt;	Go through my support queue for the past [time period]. &lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820617261-k37uky"&gt;	For each support thread:&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820617261-k37uky"&gt;	1. Identify the underlying issue or request.&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820617261-k37uky"&gt;	2. Check whether a similar issue already exists in [Linear/GitHub Issues].&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820617261-k37uky"&gt;	3. If it does, link them. If it doesn’t, draft a new issue.&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820617261-k37uky"&gt;	4. Flag any issue that appears more than [threshold] times—these are priorities.&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820617261-k37uky"&gt;	5. 
For issues that appear straightforward to fix, note that they are candidates &lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820617261-k37uky"&gt;	   for direct implementation.&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820617261-k37uky"&gt;	Don’t create issues in the tracker yet. Give me the list to review first.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Review step:&lt;/strong&gt; Review the issue list before anything goes into the tracker. Confirm deduplication is accurate—support tickets often describe the same underlying problem in different words.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;How to compound:&lt;/strong&gt; After each session, add a note about recurring issue types so Codex can categorize faster next time. Build a persistent list of known issues so deduplication improves over time.&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;10. Pull requests for non-engineers&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; Anyone who needs to make a small, well-scoped change to a codebase—such as copy updates, configuration changes, or content edits—without deep engineering knowledge.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Input sources:&lt;/strong&gt; The relevant files or repository, and a clear description of the change.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Output artifact:&lt;/strong&gt; A pull request (PR) that is reviewer-friendly and doesn’t touch anything outside the intended scope.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;First prompt:&lt;/strong&gt;&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820628006-6q08r6"&gt;	I need to make the following change: [describe the change clearly].&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820628006-6q08r6"&gt;	Before making any changes:&lt;/p&gt;&lt;p 
data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820628006-6q08r6"&gt;1. Show me which files are affected&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820628006-6q08r6"&gt;2. Confirm the scope of the change—nothing outside these files should be touched&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820628006-6q08r6"&gt;	3. Explain what you are going to do in plain language before doing it&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820628006-6q08r6"&gt;	After making the change:&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820628006-6q08r6"&gt;	1. Summarize what was changed and why&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820628006-6q08r6"&gt;	2. List every file that was touched&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820628006-6q08r6"&gt;	3. Explain how you verified the change is correct&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820628006-6q08r6"&gt;	4. Flag anything a reviewer should look at carefully&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820628006-6q08r6"&gt;	Make the smallest useful change. Don’t refactor or improve anything adjacent.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Review step:&lt;/strong&gt; Review the Codex preview before the PR is opened. Review the PR itself in GitHub or your code review tool. 
Ask a technical colleague to approve before merging if you are uncertain.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;How to compound:&lt;/strong&gt; Save a template of your preferred PR format. After each PR, add a note about anything that requires correction so future PRs avoid the same issue.&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;11. Recruiting research&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; Anyone doing outbound recruiting for a role with a specific background profile.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Input sources:&lt;/strong&gt; LinkedIn, Twitter/X, company websites, alumni databases, public professional networks.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Output artifact:&lt;/strong&gt; A list of candidates with background summaries and contact information or connection points.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;First prompt:&lt;/strong&gt;&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820638904-lqg7gg"&gt;	I am hiring for [role]. The ideal candidate has [background profile—experience, &lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820638904-lqg7gg"&gt;	prior companies, skills, career trajectory].&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820638904-lqg7gg"&gt;	Search for candidates who match this profile. For each candidate:&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820638904-lqg7gg"&gt;	1. Summarize their background in two to three sentences&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820638904-lqg7gg"&gt;	2. Note why they match the profile&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820638904-lqg7gg"&gt;	3. 
Identify any connection point (mutual connections, follows, shared affiliations)&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820638904-lqg7gg"&gt;	4. Provide a link to their public profile&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820638904-lqg7gg"&gt;	Return the top [number] candidates, ranked by how closely they match the profile.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Review step:&lt;/strong&gt; Review each candidate before any outreach. Verify that the background summaries are accurate by checking the linked profiles. Don’t send any outreach through Codex.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;How to compound:&lt;/strong&gt; Save the role profile as a template. After a successful hire, document what the actual background looked like versus the initial profile to calibrate future searches.&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;12. Strategy and planning agent&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; Leaders and operators who need to compress OKR planning, quarterly planning, or strategic reviews from days to hours.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Input sources:&lt;/strong&gt; Past planning documents, meeting transcripts, leadership context notes, relevant metrics.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Output artifact:&lt;/strong&gt; A draft plan or OKR set, structured for review and iteration.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;First prompt:&lt;/strong&gt;&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820656796-4vc62h"&gt;	I need to draft [quarterly plan / OKR set / strategic review] for [scope].&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820656796-4vc62h"&gt;	Pull from:&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" 
data-guide-block-id="guide-block-1779820656796-4vc62h"&gt;	- Past plans: [location]&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820656796-4vc62h"&gt;	- Recent meeting transcripts: [location]&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820656796-4vc62h"&gt;	- Current metrics: [location or description]&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820656796-4vc62h"&gt;	- Leadership context: [document or description]&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820656796-4vc62h"&gt;	Structure the output as [desired format].&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820656796-4vc62h"&gt;Flag any goal or initiative you are recommending that doesn’t have explicit support in the source material. I want synthesis of what has already been decided, not new recommendations baked in without my review.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Review step:&lt;/strong&gt; Review in Notion or Proof. Before sharing with leadership or the team, confirm that every major commitment traces to a decision that was actually made.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;How to compound:&lt;/strong&gt; After each planning cycle, add a retrospective to your context file. Did the goals prove achievable? What was missing from the original plan? Future planning sessions will be informed by past ones.&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;13. 
Personal learning tool&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; Anyone who wants to use Codex to support skill-building, practice, or self-directed learning.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Input sources:&lt;/strong&gt; External APIs, files, structured practice materials, your own notes.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Output artifact:&lt;/strong&gt; A custom interactive tool—like a tutor, a quiz, or a practice environment—built for your learning goal.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Example: &lt;/strong&gt;A musician wants to practice chord identification. They connect a MIDI keyboard and describe what they want, and Codex builds a small app that listens to what they play, identifies the chord, and tracks progress over time. &lt;/p&gt;&lt;p&gt;&lt;strong&gt;First prompt:&lt;/strong&gt;&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820674573-yry42c"&gt;	I want to build a personal learning tool for [skill or subject]. &lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820674573-yry42c"&gt; 	My current level: [beginner/intermediate/what I know already].&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820674573-yry42c"&gt;	What I want to practice: [specific aspect of the skill].&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820674573-yry42c"&gt;	How I want feedback: [immediate/after each session/scored].&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820674573-yry42c"&gt;	Build a prototype I can use locally. Explain what it does and how to use it before I start.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Review step:&lt;/strong&gt; Try the tool on real practice material before committing to it. 
Verify it is actually testing what you intended.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;How to compound:&lt;/strong&gt; After each practice session, ask Codex to update the tool based on what you found most and least useful. The tool improves as your needs become clearer.&lt;/p&gt;&lt;h2&gt;&lt;strong&gt;Part 5: Operating Codex well&lt;/strong&gt;&lt;/h2&gt;&lt;h3&gt;&lt;strong&gt;How to steer Codex&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;Operating Codex well is management work. You evaluate talent (which prompts, agents, and workflows to trust), set vision (what to point Codex at, and what “done” should look like), exercise taste (catching output that is technically correct but wrong for the moment), and know when to let it run and when to take the wheel.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Give Codex an outcome.&lt;/strong&gt; Describe what you want to end up with, not how to get there. “Build a research brief on [topic] with these sources and this structure” produces better results than “First search Slack, then search Notion, then...”&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Ask for a plan before long-running work.&lt;/strong&gt; For any task that will take more than a few minutes or touch multiple systems, ask Codex to explain what it’s about to do before it starts. This catches misunderstandings early and gives you a chance to redirect it before it gets too far along.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Ask Codex what it needs before it starts.&lt;/strong&gt; For complex tasks, a short briefing prompt saves time:&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820795697-av5ui3"&gt;Before you start, tell me what additional context would help you do this better. What are the most important things you would want to know?&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Require citations and audit trails for important claims.&lt;/strong&gt; Any document that will be shared or used for decisions should have source links for factual claims. 
Make this a standing rule in your preferences file.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Don’t over-manage every micro-step once the plan is good.&lt;/strong&gt; Once you have confirmed the approach, let Codex work. Interrupting undermines autonomous operation and produces worse results than reviewing the completed output.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Review in the destination app.&lt;/strong&gt; Always.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Set explicit no-send/no-post/no-archive/no-modify rules in your rules file.&lt;/strong&gt; These should apply by default to any sensitive workflow. Make Codex ask before taking any action that can’t easily be undone.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Three questions to ask before approving any significant output:&lt;/strong&gt;&lt;/p&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820810918-pg97kr"&gt;What was the hardest decision you made in producing this?&lt;/p&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820810918-pg97kr"&gt;What alternatives did you consider and reject?&lt;/p&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820810918-pg97kr"&gt;Where are you least confident?&lt;/p&gt;&lt;p&gt;These questions surface the judgment calls the model made, the options it dismissed, and the places most likely to contain errors.&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;Safety, trust, and risks&lt;/strong&gt;&lt;/h3&gt;&lt;h4&gt;&lt;strong&gt;Risk categories&lt;/strong&gt;&lt;/h4&gt;&lt;p&gt;&lt;strong&gt;Green—proceed with standard review:&lt;/strong&gt; Summaries, outlines, internal drafts, research briefs, personal notes, low-stakes scripts.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Yellow—review carefully before sharing or acting:&lt;/strong&gt; Strategy documents, customer-support drafts, product specs, recruiting research, non-destructive data pulls, PR drafts for small changes.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Red—don’t proceed without explicit human 
verification:&lt;/strong&gt; Sending messages to clients or customers, changing source-of-truth data, making production code changes, moving money, legal or compliance claims, unreconciled metrics used for business decisions.&lt;/p&gt;&lt;h4 data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820870615-20fyx4"&gt;&lt;strong&gt;Common failure modes and how to handle them&lt;/strong&gt;&lt;/h4&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820870615-20fyx4"&gt;&lt;em&gt;Confident wrongness.&lt;/em&gt; Codex can state incorrect facts with high confidence. For any factual claim that matters, verify against the source. Never pass a statistic or claim to another person without checking it.&lt;/p&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820870615-20fyx4"&gt;&lt;em&gt;Metrics errors.&lt;/em&gt; Joining data from multiple sources introduces definition mismatches and calculation errors. Verify column by column for any metric used in decisions.&lt;/p&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820870615-20fyx4"&gt;&lt;em&gt;Out-of-scope changes.&lt;/em&gt; Codex sometimes modifies files or makes improvements adjacent to the task you assigned. Review the changes line by line (called a “diff”), not just the final output, especially for any task involving code.&lt;/p&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820870615-20fyx4"&gt;&lt;em&gt;Automations that break.&lt;/em&gt; Persistent workflows stop working when tools update their APIs, credentials expire, or context files become stale. Every automation needs an owner who checks it periodically. Sever that connection—stop tending it—and the agent stops being useful. 
“Set it and forget it” isn’t a stable operating mode.&lt;/p&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820870615-20fyx4"&gt;&lt;em&gt;Plugin and integration failure.&lt;/em&gt; Plugins and integrations need maintenance: Permissions expire, APIs change, configurations need updates, and some changes require restarting Codex. Integration failures—particularly with Notion and Gmail—happen and aren’t always obvious. If a workflow produces strange output, check whether the connection is still working before assuming the prompt is wrong.&lt;/p&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820870615-20fyx4"&gt;&lt;em&gt;Usage limits.&lt;/em&gt; Long-running sessions can hit usage limits and stop mid-task. For complex workflows, break work into stages rather than attempting everything in a single session.&lt;/p&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820870615-20fyx4"&gt;&lt;em&gt;Untrusted input&lt;/em&gt;. Anything Codex reads—an email, a web page, a shared document, a support ticket—can contain instructions aimed at the agent rather than at you, sometimes hidden from human eyes. If Codex is browsing untrusted sites or processing external messages while holding broad write access, those buried instructions can turn into actions—like sending data where it shouldn’t go. So keep destructive actions behind approval, and scope each workflow to the least access it needs, so a hijacked instruction has nowhere to go.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;The human ownership standard:&lt;/strong&gt; Codex can touch any artifact in your workflow, but a human must direct the work, stand behind the output, and be able to discuss any specific decision in it. If someone asks you about a bullet point in a document Codex drafted, you should be able to answer. 
An AI-drafted document is fine—expected, even—but if someone talks it through with you and it’s clear you have no idea what’s in it, that’s a problem.&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;Team workflows: From personal Codex to shared operating system&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;Individual Codex workflows compound over time. Team workflows compound faster but require coordination.&lt;/p&gt;&lt;h4&gt;&lt;strong&gt;What changes when a team uses Codex&lt;/strong&gt;&lt;/h4&gt;&lt;p&gt;Teams build trust in agents through the humans who operate them. When a colleague receives a document or plan that Codex drafted, they trust it to the degree they trust the person who shared it.&lt;/p&gt;&lt;h4&gt;&lt;strong&gt;Infrastructure that makes team Codex work&lt;/strong&gt;&lt;/h4&gt;&lt;p&gt;&lt;em&gt;Shared review surfaces.&lt;/em&gt; A shared document review tool (Proof, Notion, Google Docs) makes agent-generated documents easier to inspect and comment on than outputs reviewed only inside Codex.&lt;/p&gt;&lt;p&gt;&lt;em&gt;Codex-mediated routing. &lt;/em&gt;Teams can combine Codex threads, automations, Slack or GitHub integrations, remote connections, and app-server APIs to build routing workflows: Requests arrive in Slack, email, or another shared intake surface; Codex helps triage them, creates reviewable tasks or drafts, and routes the work to the right human or Codex workspace for execution. Each route needs clear ownership, permissions, review rules, and a source of truth. For teams doing a lot of cross-functional requests, such as legal reviews, data pulls, or copy approvals, this pattern removes significant coordination overhead.&lt;/p&gt;&lt;p&gt;A key mechanic to making this style of work possible is giving Codex its own email address. Codex doesn’t come with one—you set it up with a tool like &lt;u&gt;&lt;a href="https://www.nylas.com/" rel="noopener noreferrer" target="_blank"&gt;Nylas&lt;/a&gt;&lt;/u&gt; that gives an agent an inbox. 
Once it has that address, you can treat it like another teammate. Routes built on an email address still need the same discipline as any other: a clear owner, scoped permissions, and a review step before anything goes back out.&lt;/p&gt;&lt;p&gt;&lt;em&gt;Agent-readable shared documentation.&lt;/em&gt; Plans, strategy documents, and operational guides written for both human and agent readers become shared infrastructure. Any team member—or any team member’s agent—can query them for specific information without interrupting the author.&lt;/p&gt;&lt;p&gt;&lt;em&gt;Explicit ownership.&lt;/em&gt; Every persistent workflow needs a named owner. That person is responsible for monitoring output quality, updating the workflow when it breaks, and retiring it when it’s no longer useful. Automation degrades without ownership.&lt;/p&gt;&lt;h4 data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820923832-39cry6"&gt;A simple way to get a team to use Codex&lt;/h4&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820923832-39cry6"&gt;Don’t try to convert everyone. As a rule of thumb, a tenth of any team will adopt a new tool no matter what, a tenth never will, and the other 80 percent come along once someone shows them how it helps their own job. Aim at that 80 percent. 
Three things, done together, help adoption along:&lt;/p&gt;&lt;ol&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820923832-39cry6"&gt;A note from a leader that makes using AI the expectation, not a nice-to-have&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820923832-39cry6"&gt;A weekly meeting where anyone can show a prompt or workflow they’ve built&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820923832-39cry6"&gt;A regular message that names the people whose work stood out&lt;/li&gt;&lt;/ol&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820923832-39cry6"&gt;Set the expectation, give people a place to share what works, and recognize them for it—that’s most of the battle.&lt;/p&gt;&lt;h2&gt;&lt;strong&gt;Part 6: Getting started&lt;/strong&gt;&lt;/h2&gt;&lt;h3&gt;&lt;strong&gt;The seven-day Codex power-user plan&lt;/strong&gt;&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Day 1: Connect and inspect.&lt;/strong&gt; Install the Codex desktop app. Connect your primary tools—Gmail, Slack, Notion, Drive, and any analytics or support tools you use. Run the workflow discovery prompt from Part 2 and review the three automation suggestions Codex returns. Don’t build anything yet. Just read the suggestions and identify which one is most useful.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Day 2: Create your context files.&lt;/strong&gt; Create your codex-workspace/ folder. Write context.md, preferences.md, and rules.md. Keep each one to one page. The goal is to capture the most important things Codex should know about you—not to be exhaustive.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Day 3: Run three one-off tasks.&lt;/strong&gt; Choose one summary task, one research brief, and one draft or plan. Use the prompt patterns from Level 1. 
Review each output carefully and note where Codex got things right and where it needed correction.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Day 4: Build your first workflow.&lt;/strong&gt; Take the most useful automation suggestion from Day 1 and fill out the workflow canvas from Level 3. Save it to workflows/ in your workspace. Run it once manually and verify the output.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Day 5: Add review rules.&lt;/strong&gt; Create reviews/data-checklist.md, reviews/writing-checklist.md, and reviews/comms-checklist.md. Start each one with five checks based on what you noticed during Days 3 and 4. These will grow over time.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Day 6: Turn one workflow into a reusable artifact.&lt;/strong&gt; Take the workflow from Day 4 and document the prompt, the output format, the review step, and any known edge cases. Save it as a complete workflow file. Run it again and verify the documentation is accurate.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Day 7: Compound.&lt;/strong&gt; Run the compounding prompt at the end of your Codex session:&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820956716-yzcjc0"&gt;Based on everything we have done this week, what should become a reusable skill,&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820956716-yzcjc0"&gt;an automation, or a small tool? 
What context should I add to my project files&lt;/p&gt;&lt;p data-guide-block-label="Prompt" data-guide-block-kind="prompt" data-guide-block-id="guide-block-1779820956716-yzcjc0"&gt;so future sessions start from a better baseline?&lt;/p&gt;&lt;p&gt;Review Codex’s suggestions and implement the one that would save the most time over the next month.&lt;/p&gt;&lt;p data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820994207-1bypew"&gt;&lt;strong&gt;30-day extension:&lt;/strong&gt;&lt;/p&gt;&lt;ul&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820994207-1bypew"&gt;Week 1: One personal workflow running reliably&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820994207-1bypew"&gt;Week 2: One multi-source workflow pulling from at least three connected tools&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820994207-1bypew"&gt;Week 3: One small tool or automation that handles a chore without your involvement&lt;/li&gt;&lt;li data-guide-block-kind="callout" data-guide-block-id="guide-block-1779820994207-1bypew"&gt;Week 4: One shared or team workflow with explicit ownership and review cadence&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;Start today. Connect the tools you’re comfortable permissioning and ask Codex what recurring workflows it can see from the available context. That question, and what you do with the answer, is the gateway to the Codex universe.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;em&gt;&lt;a href="https://every.to/@katie.parrott12" rel="noopener noreferrer" target="_blank"&gt;Katie Parrott&lt;/a&gt;&lt;/em&gt;&lt;/strong&gt; &lt;em&gt;is a staff writer at Every. You can read more of her work in&lt;/em&gt; &lt;em&gt;&lt;a href="https://katieparrott.substack.com/" rel="noopener noreferrer" target="_blank"&gt;her newsletter&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;</description>
      <author>Katie Parrott and GPT / Guides</author>
      <pubDate>2026-05-26 08:00:00 -0400</pubDate>
      <guid>https://every.to/guides/codex-for-knowledge-work</guid>
      <link>https://every.to/guides/codex-for-knowledge-work</link>
    </item>
    <item>
      <title>Cheap Competence, New Frontier</title>
      <description>&lt;table&gt;&lt;tr&gt;&lt;td&gt;&lt;img alt="Context Window" src="https://d24ovhgu8s7341.cloudfront.net/uploads/publication/logo/94/small_context_windown_1.png" /&gt;&lt;/td&gt;&lt;td&gt;&lt;/td&gt;&lt;td&gt;&lt;table&gt;&lt;tr&gt;&lt;td&gt;by &lt;a href="https://every.to/@Every%20Staff" itemprop="name"&gt;Every Staff&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;in &lt;a href="https://every.to/context-window"&gt;Context Window&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;figure&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/post/cover/4272/full_page_cover_538de39f5ffb9a8b-CW.png"&gt;&lt;figcaption&gt;Midjourney/Every illustration.&lt;/figcaption&gt;&lt;/figure&gt;&lt;p&gt;Hello, and happy Sunday!&lt;em&gt; &lt;/em&gt;This week we published &lt;u&gt;&lt;a href="https://every.to/p/after-automation" rel="noopener noreferrer" target="_blank"&gt;“After Automation,”&lt;/a&gt;&lt;/u&gt; &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@danshipper" rel="noopener noreferrer" target="_blank"&gt;Dan Shipper&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;’s argument that even when you automate as much as we have, there’s always a new frame for humans to hand to the models. COO &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@brandon_5263" rel="noopener noreferrer" target="_blank"&gt;Brandon Gell&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; and new head of marketing &lt;strong&gt;Douglas Brundage&lt;/strong&gt; tested the idea by moving their agent work into &lt;u&gt;&lt;a href="https://every.to/context-window/inside-the-100-agent-software-factory" rel="noopener noreferrer" target="_blank"&gt;public internal Slack channels&lt;/a&gt;&lt;/u&gt; and watching the lurkers gather. 
Anthropic’s reported $300 million acquisition of developer-tools &lt;u&gt;&lt;a href="https://every.to/podcast/inside-stainless-the-developer-tools-startup-anthropic-just-bought-for-300-million" rel="noopener noreferrer" target="_blank"&gt;startup Stainless&lt;/a&gt;&lt;/u&gt; rides on the same bet—that an agent can’t use a company’s API unless a human has first made it easy to use, which is what Dan&lt;strong&gt; &lt;/strong&gt;and CEO &lt;strong&gt;&lt;u&gt;&lt;a href="https://www.linkedin.com/in/alexrattray/" rel="noopener noreferrer" target="_blank"&gt;Alex Rattray&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; talked through on &lt;em&gt;&lt;u&gt;&lt;a href="https://every.to/podcast" rel="noopener noreferrer" target="_blank"&gt;AI &amp;amp; I&lt;/a&gt;&lt;/u&gt;&lt;/em&gt; months before the deal.&lt;/p&gt;&lt;p&gt;Scroll down for two takes from the ground at &lt;u&gt;&lt;a href="https://every.to/context-window/google-i-o-agents-agents-agents" rel="noopener noreferrer" target="_blank"&gt;Google I/O&lt;/a&gt;&lt;/u&gt;—&lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@jackcheng" rel="noopener noreferrer" target="_blank"&gt;Jack Cheng&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; on why Google is aiming at everyday users, not the AI crowd, and &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@AlxAi" rel="noopener noreferrer" target="_blank"&gt;Alex Duffy&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; on &lt;strong&gt;Demis Hassabis&lt;/strong&gt;’s claim that AGI is a few years out—and what Google’s been doing to take us there. 
Plus, a mini-Vibe Check on &lt;u&gt;&lt;a href="https://every.to/context-window/inside-the-100-agent-software-factory" rel="noopener noreferrer" target="_blank"&gt;Gas City&lt;/a&gt;&lt;/u&gt; from head of tech consulting &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@mike_2114" rel="noopener noreferrer" target="_blank"&gt;Mike Taylor&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; and a &lt;u&gt;&lt;a href="https://every.to/context-window/inside-the-100-agent-software-factory" rel="noopener noreferrer" target="_blank"&gt;Grok-based “banger classifier”&lt;/a&gt;&lt;/u&gt; &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@katie.parrott12" rel="noopener noreferrer" target="_blank"&gt;Katie Parrott&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; is running her X drafts through, and Katie’s &lt;u&gt;&lt;a href="https://every.to/working-overtime/how-to-start-a-career-when-ai-is-doing-your-entry-level-job" rel="noopener noreferrer" target="_blank"&gt;playbook&lt;/a&gt;&lt;/u&gt; for new grads facing AI-driven entry-level cuts at Meta and beyond—copy-paste career-coach prompt included. We’re off Monday for U.S. Memorial Day and back in your inbox on Tuesday.&lt;em&gt;—&lt;u&gt;&lt;a href="https://every.to/@kate_1767" rel="noopener noreferrer" target="_blank"&gt;Kate Lee&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;Was this newsletter forwarded to you? 
&lt;u&gt;&lt;a href="https://every.to/account" rel="noopener noreferrer" target="_blank"&gt;Sign up&lt;/a&gt;&lt;/u&gt; to get it in your inbox.&lt;/em&gt;&lt;/p&gt;&lt;h2&gt;&lt;hr class="quill-line"&gt;&lt;/h2&gt;&lt;h2&gt;Knowledge base&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/p/after-automation" rel="noopener noreferrer" target="_blank"&gt;“After Automation”&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;&lt;em&gt; by &lt;u&gt;&lt;a href="https://every.to/@danshipper" rel="noopener noreferrer" target="_blank"&gt;Dan Shipper&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;: We’ve automated as much as possible at Every—agents write the code, draft emails, and compile the newsletter—and yet there’s more human work to do than ever. Dan’s new report traces what happens when cheap competence floods the market and argues there’s always a new frame for humans to hand the models. Read this for the case that progress expands human work rather than ending it.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/context-window/google-i-o-agents-agents-agents" rel="noopener noreferrer" target="_blank"&gt;“Google I/O: Agents, Agents, Agents”&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;&lt;em&gt; by &lt;u&gt;&lt;a href="https://every.to/@jackcheng" rel="noopener noreferrer" target="_blank"&gt;Jack Cheng&lt;/a&gt;&lt;/u&gt;/&lt;u&gt;&lt;a href="https://every.to/context-window" rel="noopener noreferrer" target="_blank"&gt;Context Window&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;: Google’s I/O keynote rebuilt search and assistants around agents—a default AI Mode, the 24/7 Gemini Spark, and a Universal Cart co-built with Amazon, Meta, and Microsoft—all on Gemini 3.5 Flash, pitched as Opus 4.7-level intelligence at four times the speed and half the cost. 
Read Jack Cheng’s report from the field for why Google’s I/O bets on distribution over benchmarks.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/playtesting/notes-from-the-foothills-of-the-singularity" rel="noopener noreferrer" target="_blank"&gt;“Notes From the Foothills of the Singularity”&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;&lt;em&gt; by &lt;u&gt;&lt;a href="https://every.to/@AlxAi" rel="noopener noreferrer" target="_blank"&gt;Alex Duffy&lt;/a&gt;&lt;/u&gt;/&lt;u&gt;&lt;a href="https://every.to/playtesting" rel="noopener noreferrer" target="_blank"&gt;Playtesting&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;strong&gt;:&lt;/strong&gt; At Google I/O, &lt;strong&gt;Demis Hassabis&lt;/strong&gt; placed AGI “just a few years” out and put its total impact at 10 times the Industrial Revolution. Alex Duffy frames the other side of the story through his Uber driver back from Mountain View: a 54-year-old construction worker who knows the city by heart and is worried his job is next. Read this for the tension between Google’s compute-at-scale ambitions and the workers whose ground it’s reshaping.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/context-window/inside-the-100-agent-software-factory" rel="noopener noreferrer" target="_blank"&gt;“Inside the 100-agent Software Factory”&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;&lt;em&gt; by &lt;u&gt;&lt;a href="https://every.to/@katie.parrott12" rel="noopener noreferrer" target="_blank"&gt;Katie Parrott&lt;/a&gt;&lt;/u&gt;/&lt;u&gt;&lt;a href="https://every.to/context-window" rel="noopener noreferrer" target="_blank"&gt;Context Window&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;: Mike Taylor previewed Gas City, the successor to&lt;strong&gt; Steve Yegge&lt;/strong&gt;’s viral Gas Town—an orchestration toolkit where a persistent “mayor” agent dispatches anonymous “polecat” workers. 
Read this for the multi-agent engineering ideas worth internalizing even without the tool.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/working-overtime/how-to-start-a-career-when-ai-is-doing-your-entry-level-job" rel="noopener noreferrer" target="_blank"&gt;“How to Start a Career When AI Is Doing Your Entry-level Job”&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;&lt;em&gt; by &lt;u&gt;&lt;a href="https://every.to/@katie.parrott12" rel="noopener noreferrer" target="_blank"&gt;Katie Parrott&lt;/a&gt;&lt;/u&gt;/&lt;u&gt;&lt;a href="https://every.to/working-overtime" rel="noopener noreferrer" target="_blank"&gt;Working Overtime&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;: As Meta and other companies announce job cuts citing AI, Stanford’s Digital Economy Lab finds employment for 22-to-25-year-olds in AI-vulnerable jobs is down 13 percent since late 2022, while older workers have held steady. &lt;strong&gt;&lt;a href="https://every.to/@katie.parrott12" rel="noopener noreferrer" target="_blank"&gt;Katie Parrott&lt;/a&gt;&lt;/strong&gt; offers four moves for new grads navigating an entry-level rung that’s getting kicked out. 
Read this for a copy-paste career-coach prompt and the case for protecting one craft from AI.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;Log on&lt;/h2&gt;&lt;p&gt;We host &lt;u&gt;&lt;a href="https://every.to/events" rel="noopener noreferrer" target="_blank"&gt;camps and workshops&lt;/a&gt;&lt;/u&gt; on topics like &lt;u&gt;&lt;a href="https://www.youtube.com/watch?v=7YUBxMTF1Tc" rel="noopener noreferrer" target="_blank"&gt;compound engineering&lt;/a&gt;&lt;/u&gt; and &lt;u&gt;&lt;a href="https://www.youtube.com/watch?v=oEvjbPwGwnc" rel="noopener noreferrer" target="_blank"&gt;writing with AI&lt;/a&gt;&lt;/u&gt; to share what we’ve learned from training teams at companies like the &lt;em&gt;&lt;u&gt;&lt;a href="https://every.to/on-every/the-next-chapter-of-every-consulting" rel="noopener noreferrer" target="_blank"&gt;New York Times&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;u&gt;&lt;a href="https://every.to/on-every/the-next-chapter-of-every-consulting" rel="noopener noreferrer" target="_blank"&gt; and leading hedge funds&lt;/a&gt;&lt;/u&gt;, and by using and experimenting with AI every day ourselves.&lt;/p&gt;&lt;h5&gt;Upcoming event&lt;/h5&gt;&lt;ul&gt;&lt;li&gt;&lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/events/executive-AI-sessions" rel="noopener noreferrer" target="_blank"&gt;Executive AI Sessions&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;: On June 2, head of consulting &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@natalia_2944" rel="noopener noreferrer" target="_blank"&gt;Natalia Quintero&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; hosts a live webinar introducing &lt;u&gt;&lt;a href="https://every.to/consulting" rel="noopener noreferrer" target="_blank"&gt;Every Consulting&lt;/a&gt;&lt;/u&gt;’s new offering for leadership teams navigating AI adoption—built on the playbook we’ve been running with executive clients for months. 
&lt;u&gt;&lt;a href="https://every.to/events/executive-AI-sessions" rel="noopener noreferrer" target="_blank"&gt;Learn more and register&lt;/a&gt;&lt;/u&gt;.&lt;/li&gt;&lt;/ul&gt;&lt;h5&gt;In New York City&lt;/h5&gt;&lt;ul&gt;&lt;li&gt;&lt;strong&gt;&lt;u&gt;&lt;a href="https://luma.com/2o67t7ob" rel="noopener noreferrer" target="_blank"&gt;Every 🤝 IR&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;L: Join us at the Every brownstone in Brooklyn on June 3 during New York Tech Week for a subscriber-only meetup celebrating the Every community over drinks and conversation. &lt;u&gt;&lt;a href="https://luma.com/2o67t7ob" rel="noopener noreferrer" target="_blank"&gt;Learn more and RSVP&lt;/a&gt;&lt;/u&gt;.&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;Alignment&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Think boom, not doom. &lt;/strong&gt;At an obesity conference in Istanbul last week, two words seemed to be on everyone’s lips: GLP-1s and AI. It is hard to think of two more important technologies arriving in healthcare at the same time. GLP-1s are changing what we know about biology, and AI is changing the distribution of knowledge. I can’t even begin to imagine what the world is going to look like in the next five, 10, or 10 years. &lt;/p&gt;&lt;p&gt;Even so, a recurring question was this: What happens to the doctor-patient relationship when medical knowledge becomes abundant?&lt;/p&gt;&lt;p&gt;A growing number of patients are taking their health data, quite literally, into their own hands. They wear an Oura ring and get blood work through companies like Function Health or Superpower. They upload lab results, medical history, symptoms, medications, and sometimes even genetic data into ChatGPT or Claude. With enough context and persistence, they can generate a reasonably sophisticated view of their own health risks, possible diagnoses, or whatever else they might want to know about their biology. 
&lt;/p&gt;&lt;div class="quill-block-image" id="quill-block-image-1779480089726" data-source="{&amp;quot;dom_id&amp;quot;:&amp;quot;quill-block-image-1779480089726&amp;quot;,&amp;quot;link&amp;quot;:&amp;quot;https://www.stifel.com/newsletters/investmentbanking/bal/marketing/healthcare/biopharma_timopler/2025/BiopharmaMarketUpdate_091125.pdf&amp;quot;,&amp;quot;image&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4272/optimized_40792d4b-8075-446d-9cff-9ce1926d639d.png&amp;quot;,&amp;quot;caption&amp;quot;:&amp;quot;Share of U.S. consumers who have self-diagnosed using a commercially available LLM, 2023–2025. (Source: Bain, Stifel.)&amp;quot;,&amp;quot;error&amp;quot;:null}"&gt;&lt;div&gt;&lt;a href="https://www.stifel.com/newsletters/investmentbanking/bal/marketing/healthcare/biopharma_timopler/2025/BiopharmaMarketUpdate_091125.pdf" target="_blank" rel="noopener noreferrer"&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4272/optimized_40792d4b-8075-446d-9cff-9ce1926d639d.png" alt="Share of U.S. consumers who have self-diagnosed using a commercially available LLM, 2023–2025. (Source: Bain, Stifel.)"&gt;&lt;/a&gt;&lt;figcaption class="quill-image-caption"&gt;Share of U.S. consumers who have self-diagnosed using a commercially available LLM, 2023–2025. (Source: Bain, Stifel.)&lt;/figcaption&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;&lt;br&gt;&lt;/p&gt;&lt;p&gt;Two things will change about how medicine will be practiced in the next few years. &lt;/p&gt;&lt;p&gt;First, there may be fewer people utilizing primary care, especially among younger, tech-savvy patients in cities like San Francisco, New York, and Austin. Some visits that used to be driven by uncertainty may be replaced by AI-guided reassurance, self-triage, or more targeted use of labs, telehealth, and specialists. 
The result will be fewer low-information visits, which could be beneficial if it frees capacity for people who need in-person care most.&lt;/p&gt;&lt;p&gt;Second, when patients do see doctors, they will not come empty-handed, waiting for the physician to be the sole authority. They’ll be armed with much sharper questions. This is where Dan’s point about cheap competence becomes so important. As models commoditize medical knowledge, the value of situated judgment rises. The scarce skill becomes knowing what to do next for this particular person.&lt;/p&gt;&lt;p&gt;I am optimistic. AI does not make physicians irrelevant. It just makes excellent physicians more valuable.—&lt;em&gt;&lt;u&gt;&lt;a href="https://x.com/Ashwinreads" rel="noopener noreferrer" target="_blank"&gt;Ashwin Sharma&lt;/a&gt;&lt;/u&gt;&lt;/em&gt; &lt;/p&gt;&lt;h3&gt;&lt;hr class="quill-line"&gt;&lt;/h3&gt;&lt;p&gt;&lt;em&gt;That’s all for this week! Be sure to follow Every on X at &lt;u&gt;&lt;a href="https://twitter.com/every" rel="noopener noreferrer" target="_blank"&gt;@every&lt;/a&gt;&lt;/u&gt; and on &lt;u&gt;&lt;a href="https://www.linkedin.com/company/everyinc/" rel="noopener noreferrer" target="_blank"&gt;LinkedIn&lt;/a&gt;&lt;/u&gt;.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;We &lt;u&gt;&lt;a href="https://every.to/studio" rel="noopener noreferrer" target="_blank"&gt;build AI tools&lt;/a&gt;&lt;/u&gt; for readers like you. Write brilliantly with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://writewithspiral.com/" rel="noopener noreferrer" target="_blank"&gt;Spiral&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. Organize files automatically with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://makeitsparkle.co/?utm_source=everyfooter" rel="noopener noreferrer" target="_blank"&gt;Sparkle&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. 
Deliver yourself from email with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://cora.computer" rel="noopener noreferrer" target="_blank"&gt;Cora&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. Dictate effortlessly with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://monologue.to" rel="noopener noreferrer" target="_blank"&gt;Monologue&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. Work on documents with AI agents using &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://www.proofeditor.ai/?source=post_button" rel="noopener noreferrer" target="_blank"&gt;Proof&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;For sponsorship opportunities, reach out to sponsorships@every.to.&lt;/em&gt;&lt;/p&gt;&lt;div class="quill-button" data-source="{&amp;quot;id&amp;quot;:&amp;quot;quill-button-1779480175311&amp;quot;,&amp;quot;text&amp;quot;:&amp;quot;Upgrade to paid&amp;quot;,&amp;quot;url&amp;quot;:&amp;quot;https://every.to/subscribe?source=post_button&amp;quot;}" id="quill-button-1779480175311"&gt;&lt;a href="https://every.to/subscribe?source=post_button"&gt;Upgrade to paid&lt;/a&gt;&lt;/div&gt;</description>
      <author>Every Staff / Context Window</author>
      <pubDate>2026-05-24 05:00:00 -0400</pubDate>
      <guid>https://every.to/context-window/cheap-competence-new-frontier</guid>
      <link>https://every.to/context-window/cheap-competence-new-frontier</link>
    </item>
    <item>
      <title>Notes From the Foothills of the Singularity</title>
      <description>&lt;table&gt;&lt;tr&gt;&lt;td&gt;&lt;img alt="Playtesting" src="https://d24ovhgu8s7341.cloudfront.net/uploads/publication/logo/102/small_playtesting.png" /&gt;&lt;/td&gt;&lt;td&gt;&lt;/td&gt;&lt;td&gt;&lt;table&gt;&lt;tr&gt;&lt;td&gt;by &lt;a href="https://every.to/@AlxAi" itemprop="name"&gt;Alex Duffy&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;in &lt;a href="https://every.to/playtesting"&gt;Playtesting&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;figure&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/post/cover/4271/full_page_cover_64ed47e3278bbb58-Notes_From_the_Foothills_of_the_Singularity.png"&gt;&lt;figcaption&gt;Midjourney/Every illustration.&lt;/figcaption&gt;&lt;/figure&gt;&lt;p&gt;&lt;em&gt;Was this newsletter forwarded to you? &lt;u&gt;&lt;a href="https://every.to/account" rel="noopener noreferrer" target="_blank"&gt;Sign up&lt;/a&gt;&lt;/u&gt; to get it in your inbox.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;u&gt;&lt;a href="https://every.to/context-window/google-s-ai-vision-make-tech-human-again" rel="noopener noreferrer" target="_blank"&gt;Last year at Google I/O&lt;/a&gt;&lt;/u&gt;, the company made an overwhelming 100 announcements, including an AI video model—Veo 3—that was miles ahead of anything else at the time. This year had less &lt;em&gt;wow&lt;/em&gt; but &lt;u&gt;&lt;a href="https://every.to/context-window/google-i-o-agents-agents-agents#signal" rel="noopener noreferrer" target="_blank"&gt;more dutiful iteration&lt;/a&gt;&lt;/u&gt;. Gemini 3.5 Flash is faster and more capable than Google’s previous frontier model. Search now builds the right small tool to answer your question on the fly. Gemini assistants can keep running with your laptop closed. 
Even &lt;u&gt;&lt;a href="https://blog.google/innovation-and-ai/models-and-research/gemini-models/gemini-omni/" rel="noopener noreferrer" target="_blank"&gt;Gemini Omni&lt;/a&gt;&lt;/u&gt;, a new, multi-model world model that intuitively understands gravity, kinetic energy, and fluid dynamics—and will likely help train robots—is, for now, being billed as “Nano Banana for video.”&lt;/p&gt;&lt;p&gt;In a year when competitors like OpenAI continued to throw things at the wall—touting its video model, &lt;u&gt;&lt;a href="https://every.to/vibe-check/openai-made-video-creation-effortless-here-s-what-happened-next" rel="noopener noreferrer" target="_blank"&gt;Sora 2&lt;/a&gt;&lt;/u&gt;, as a ChatGPT moment for video that, according to former head &lt;strong&gt;Bill Peebles&lt;/strong&gt;, would “evolve into a mini alternate reality”—only to shut it down later in the same year. Or leaned into the work market while simultaneously talking, as Anthropic CEO &lt;strong&gt;Dario Amodei&lt;/strong&gt; did, about AI’s potential &lt;u&gt;&lt;a href="https://www.anthropic.com/research/labor-market-impacts" rel="noopener noreferrer" target="_blank"&gt;to decimate entry-level jobs&lt;/a&gt;&lt;/u&gt;, Google’s releases were not flashy. 
But filling the gaps both within AI’s &lt;u&gt;&lt;a href="https://x.com/karpathy/status/1816531576228053133?lang=en" rel="noopener noreferrer" target="_blank"&gt;jagged intelligence&lt;/a&gt;&lt;/u&gt; and across its products, while getting the tools to people who will use them, is probably orders of magnitude more important.&lt;/p&gt;&lt;div class="quill-block-image" id="quill-block-image-1779476709416-43ugqfdws" data-source="{&amp;quot;dom_id&amp;quot;:&amp;quot;quill-block-image-1779476709416-43ugqfdws&amp;quot;,&amp;quot;link&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4271/optimized_6830ba13-a08f-4e4b-ad44-300164becb9f.png&amp;quot;,&amp;quot;image&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4271/optimized_6830ba13-a08f-4e4b-ad44-300164becb9f.png&amp;quot;,&amp;quot;caption&amp;quot;:&amp;quot;Attendees at this year’s Google I/O, with the swooping, landscape-inspired roof of the company’s Bay View campus buildings. (All photos courtesy of Alex Duffy.)&amp;quot;,&amp;quot;error&amp;quot;:null}"&gt;&lt;div&gt;&lt;a href="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4271/optimized_6830ba13-a08f-4e4b-ad44-300164becb9f.png" target="_blank" rel="noopener noreferrer"&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4271/optimized_6830ba13-a08f-4e4b-ad44-300164becb9f.png" alt="Attendees at this year’s Google I/O, with the swooping, landscape-inspired roof of the company’s Bay View campus buildings. (All photos courtesy of Alex Duffy.)"&gt;&lt;/a&gt;&lt;figcaption class="quill-image-caption"&gt;Attendees at this year’s Google I/O, with the swooping, landscape-inspired roof of the company’s Bay View campus buildings. 
(All photos courtesy of Alex Duffy.)&lt;/figcaption&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;&lt;br&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Demis Hassabis&lt;/strong&gt;, CEO of Google DeepMind, called this moment the “foothills of the singularity.” He puts &lt;u&gt;&lt;a href="https://every.to/chain-of-thought/toward-a-definition-of-agi" rel="noopener noreferrer" target="_blank"&gt;artificial general intelligence (AGI)&lt;/a&gt;&lt;/u&gt; “just a few years” out and its total impact at 10 times the Industrial Revolution, and arriving 10 times faster. We now have the ability to automate almost anything we can capture reliable data on, but one of the biggest hurdles is convincing society that it’s worth investing in that ability. Right now most people &lt;u&gt;&lt;a href="https://www.nbcnews.com/politics/politics-news/poll-majority-voters-say-risks-ai-outweigh-benefits-rcna262196" rel="noopener noreferrer" target="_blank"&gt;don’t think it is&lt;/a&gt;&lt;/u&gt;.&lt;/p&gt;&lt;p&gt;Hassabis called out explicitly that “it’s incumbent on the field, our field, the AI field and industry to show the unequivocal benefits more clearly and more concretely.” My impression, after this year’s conference, is that Google sees the precarity of the current moment clearly, and its scale gives it a rare position to do something about it.&lt;/p&gt;&lt;h2&gt;The loop&lt;/h2&gt;&lt;p&gt;Google’s loop works like this: Researchers find new data, improve the model architecture, and train a new one. The model is trained specifically to fit into their &lt;u&gt;&lt;a href="https://antigravity.google/" rel="noopener noreferrer" target="_blank"&gt;“Antigravity”&lt;/a&gt;&lt;/u&gt; harness, giving it the ability to write and run code, and therefore do pretty much anything else. The company then applies it across every product: Search, Docs, YouTube, Gmail, Android, the works. Users try it out and provide feedback implicitly through behavior and explicitly with thumbs up or down ratings. 
The next model improves. Everything happens across Google’s full stack—the chips it designs, the data centers it owns, the models, the deployment pipeline, billions of users on more than half a dozen core apps. This past year has been about realigning the organization to run that loop at scale.&lt;/p&gt;&lt;p&gt;Internal tools are being rewritten to be 20 times faster and built for agents. Google is looking at how experts within and outside of the organization work, collecting that high-quality data, identifying the underlying capability gaps, then training models to overcome them.&lt;/p&gt;&lt;p&gt;It shows up as a search box that can build a custom widget for your question on the fly, helping drive home a deeper understanding than a headline. Or in an easier-to-use Gemini app, which just passed 900 million monthly users and will soon have a 24/7 personal agent doing research across your emails, catching tasks and running with them asynchronously, returning drafts, reports, itineraries, and more. Google’s adding new agents to surfaces across its family of apps like Maps and Shopping, all of them powered by Gemini 3.5 Flash and the Antigravity harness—the same combination that can &lt;u&gt;&lt;a href="https://antigravity.google/blog/google-antigravity-built-an-os" rel="noopener noreferrer" target="_blank"&gt;build a working operating system&lt;/a&gt;&lt;/u&gt; in 12 hours with 93 sub-agents for under $1,000. None of that was possible six months ago. 
Now billions of people will use these tools to pursue their goals, often without realizing that they’re using them.&lt;/p&gt;&lt;div class="quill-block-image" id="quill-block-image-1779476709437-ijdbkboa2" data-source="{&amp;quot;dom_id&amp;quot;:&amp;quot;quill-block-image-1779476709437-ijdbkboa2&amp;quot;,&amp;quot;link&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4271/optimized_3b51617d-965b-48c3-a047-e4df209648c6.jpeg&amp;quot;,&amp;quot;image&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4271/optimized_3b51617d-965b-48c3-a047-e4df209648c6.jpeg&amp;quot;,&amp;quot;caption&amp;quot;:&amp;quot;Google DeepMind CEO Demis Hassabis at the “AI and the frontiers of science” session on the second day of the conference.&amp;quot;,&amp;quot;error&amp;quot;:null}"&gt;&lt;div&gt;&lt;a href="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4271/optimized_3b51617d-965b-48c3-a047-e4df209648c6.jpeg" target="_blank" rel="noopener noreferrer"&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4271/optimized_3b51617d-965b-48c3-a047-e4df209648c6.jpeg" alt="Google DeepMind CEO Demis Hassabis at the “AI and the frontiers of science” session on the second day of the conference."&gt;&lt;/a&gt;&lt;figcaption class="quill-image-caption"&gt;Google DeepMind CEO Demis Hassabis at the “AI and the frontiers of science” session on the second day of the conference.&lt;/figcaption&gt;&lt;/div&gt;&lt;/div&gt;&lt;h2&gt;The obligation&lt;/h2&gt;&lt;p&gt;A year ago, Google processed 480 trillion tokens a month. Last month, that number was 3.2 &lt;em&gt;quadrillion&lt;/em&gt;—3 trillion a day, doubling every three weeks. Its capital expenditures this year were around $180 billion, almost six times what they were in 2022. 
But so far, the general public is &lt;u&gt;&lt;a href="https://www.nbcnews.com/politics/politics-news/poll-majority-voters-say-risks-ai-outweigh-benefits-rcna262196" rel="noopener noreferrer" target="_blank"&gt;not convinced&lt;/a&gt;&lt;/u&gt; that the investment is worth it. What most people see, instead, is white-collar layoffs, resource-hungry data centers going up in their backyards, and a small group getting very rich.&lt;/p&gt;&lt;p&gt;My Uber driver back from Mountain View to San Francisco was 54 years old, still worked in construction, and optimized his routes around the goings-on of a city he knew intimately. He’d never heard of Hassabis or how games could &lt;u&gt;&lt;a href="https://every.to/playtesting/we-trained-an-ai-on-a-board-game-it-became-a-better-customer-support-agent-299b5938-09dd-4881-803f-aea21f0d461f" rel="noopener noreferrer" target="_blank"&gt;help teach AI&lt;/a&gt;&lt;/u&gt;, but was curious about what happened at I/O. He opened our conversation with a worry about layoffs, the rich getting richer, and the question of who would be left to spend in the economy. I asked a lot of questions and mentioned how Hassabis emphasized the obligation of the industry to “show the unequivocal benefits of AI more clearly.” I shared my admiration for Hassabis’s clear, vocal focus on curing all disease, and the progress made so far thanks to &lt;u&gt;&lt;a href="https://every.to/context-window/ai-work-is-splitting-in-two#alignment" rel="noopener noreferrer" target="_blank"&gt;AlphaFold&lt;/a&gt;&lt;/u&gt;. We talked about how one person could now do &lt;u&gt;&lt;a href="https://every.to/chain-of-thought/the-two-slice-team" rel="noopener noreferrer" target="_blank"&gt;what used to take a team&lt;/a&gt;&lt;/u&gt;, and how that opens room for more small businesses, though the road there may be pocked with layoffs. 
By the time we arrived in San Francisco, he had moved the &lt;u&gt;&lt;a href="https://www.youtube.com/watch?v=d95J8yzvjbQ" rel="noopener noreferrer" target="_blank"&gt;YouTube documentary he’d saved&lt;/a&gt;&lt;/u&gt; to the top of his watch list.&lt;/p&gt;&lt;p&gt;I think people &lt;em&gt;want&lt;/em&gt; to be excited. The promise is real—AI is the best general-purpose tool we’ve ever had for science. Data centers &lt;u&gt;&lt;a href="https://progresschamber.org/insights/data-centers-cut-property-taxes-virginia-homeowners/" rel="noopener noreferrer" target="_blank"&gt;already pay&lt;/a&gt;&lt;/u&gt; half of some counties’ property tax revenue, lessening the burden on everyday people and providing dramatically better returns on resources like water &lt;u&gt;&lt;a href="https://x.com/Smirkley/status/2056768120962814099" rel="noopener noreferrer" target="_blank"&gt;than alternatives&lt;/a&gt;&lt;/u&gt;. On the horizon are cures we’ve been chasing for decades, materials that could increase our energy efficiency while reducing our footprint, and education that adapts to the learner. Self-driving cars could save tens of thousands of American lives a year and provide the freedom of mobility to many. They will also be coming for my driver’s job. The promise arrives at scale, but the cost arrives household by household. 
Unless the industry shows upsides as tangible as today’s downsides, whether actual or perceived, and invests in the people displaced first, progress will slow.&lt;/p&gt;&lt;div class="quill-block-image" id="quill-block-image-1779476709439-6rt8m84zx" data-source="{&amp;quot;dom_id&amp;quot;:&amp;quot;quill-block-image-1779476709439-6rt8m84zx&amp;quot;,&amp;quot;link&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4271/optimized_b3abb9e0-0a40-4b4f-a46b-c938ffdcaef6.jpeg&amp;quot;,&amp;quot;image&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4271/optimized_b3abb9e0-0a40-4b4f-a46b-c938ffdcaef6.jpeg&amp;quot;,&amp;quot;caption&amp;quot;:&amp;quot;James Manyika, president of Research, Labs, Technology &amp;amp; Society at Google and Alphabet (left), in conversation with Hartmut Neven, founder and lead of Google Quantum AI, who is holding up one of Google’s Willow quantum chips.&amp;quot;,&amp;quot;error&amp;quot;:null}"&gt;&lt;div&gt;&lt;a href="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4271/optimized_b3abb9e0-0a40-4b4f-a46b-c938ffdcaef6.jpeg" target="_blank" rel="noopener noreferrer"&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4271/optimized_b3abb9e0-0a40-4b4f-a46b-c938ffdcaef6.jpeg" alt="James Manyika, president of Research, Labs, Technology &amp;amp; Society at Google and Alphabet (left), in conversation with Hartmut Neven, founder and lead of Google Quantum AI, who is holding up one of Google’s Willow quantum chips."&gt;&lt;/a&gt;&lt;figcaption class="quill-image-caption"&gt;James Manyika, president of Research, Labs, Technology &amp;amp; Society at Google and Alphabet (left), in conversation with Hartmut Neven, founder and lead of Google Quantum AI, who is holding up one of Google’s Willow quantum chips.&lt;/figcaption&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;&lt;br&gt;&lt;/p&gt;&lt;p&gt;The window is open. 
Google and others have built the infrastructure to run this cycle at scale and put it in the hands of billions. This past week mathematicians &lt;u&gt;&lt;a href="https://openai.com/index/model-disproves-discrete-geometry-conjecture/" rel="noopener noreferrer" target="_blank"&gt;used a frontier model&lt;/a&gt;&lt;/u&gt; to uncover a mathematical secret that had eluded us for 80 years, disproving a long-standing conjecture in discrete geometry. That used to require a PhD or a team. Now it can mean one curious person and a coding agent. What’s left is to point these tools at problems worth solving right now, ones that produce visible benefits for individuals and communities alike. Announcements like the &lt;u&gt;&lt;a href="https://www.geminixprize.com" rel="noopener noreferrer" target="_blank"&gt;Gemini XPRIZE&lt;/a&gt;&lt;/u&gt;, which aims to do just this, show that the company understands the urgency of the moment. So does simply getting the tools into the hands of more people, especially when the learning curve is as shallow as asking a question.&lt;/p&gt;&lt;p&gt;I’m excited about the robotics updates and the world models being built for simulation. The bigger moonshots are coming. But the work most worth doing right now is the work in front of us, with the people around us. The future, in Hassabis’s words, is yet to be written. But we must also be careful with direction and not mistake activity for achievement. The stakes are high. The conversations we have, the stories we tell, and the way we use these tools today will define what comes tomorrow.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;em&gt;Alex Duffy&lt;/em&gt;&lt;/strong&gt;&lt;em&gt; is the cofounder and CEO of &lt;u&gt;&lt;a href="https://goodstartlabs.com/" rel="noopener noreferrer" target="_blank"&gt;Good Start Labs&lt;/a&gt;&lt;/u&gt; and a contributing writer. 
&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;To read more essays like this, subscribe to &lt;u&gt;&lt;a href="https://every.to/subscribe" rel="noopener noreferrer" target="_blank"&gt;Every&lt;/a&gt;&lt;/u&gt;, and follow us on X at &lt;u&gt;&lt;a href="http://twitter.com/every" rel="noopener noreferrer" target="_blank"&gt;@every&lt;/a&gt;&lt;/u&gt; and on &lt;u&gt;&lt;a href="https://www.linkedin.com/company/everyinc/" rel="noopener noreferrer" target="_blank"&gt;LinkedIn&lt;/a&gt;&lt;/u&gt;.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;We &lt;u&gt;&lt;a href="https://every.to/studio" rel="noopener noreferrer" target="_blank"&gt;build AI tools&lt;/a&gt;&lt;/u&gt; for readers like you. Write brilliantly with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://writewithspiral.com/" rel="noopener noreferrer" target="_blank"&gt;Spiral&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. Organize files automatically with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://makeitsparkle.co/?utm_source=everyfooter" rel="noopener noreferrer" target="_blank"&gt;Sparkle&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. Deliver yourself from email with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://cora.computer/" rel="noopener noreferrer" target="_blank"&gt;Cora&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. Dictate effortlessly with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://monologue.to/" rel="noopener noreferrer" target="_blank"&gt;Monologue&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. Collaborate with agents on documents with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;a href="https://www.proofeditor.ai/" rel="noopener noreferrer" target="_blank"&gt;Proof&lt;/a&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. &lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;Help us scale the only subscription you need to stay at the edge of AI. 
Explore &lt;u&gt;&lt;a href="https://www.notion.so/Jobs-Every-25cca4f355ac80c5ad6ee7a6e93d6b4e?pvs=21" rel="noopener noreferrer" target="_blank"&gt;open roles at Every&lt;/a&gt;&lt;/u&gt;.&lt;/em&gt;&lt;/p&gt;&lt;div class="quill-button" data-source="{&amp;quot;id&amp;quot;:&amp;quot;quill-button-1769187301610&amp;quot;,&amp;quot;url&amp;quot;:&amp;quot;https://every.to/subscribe?source=post_button&amp;quot;,&amp;quot;text&amp;quot;:&amp;quot;Subscribe&amp;quot;}" id="quill-button-1769187301610"&gt;&lt;a href="https://every.to/subscribe?source=post_button"&gt;Subscribe&lt;/a&gt;&lt;/div&gt;</description>
      <author>Alex Duffy / Playtesting</author>
      <pubDate>2026-05-22 05:00:00 -0400</pubDate>
      <guid>https://every.to/playtesting/notes-from-the-foothills-of-the-singularity</guid>
      <link>https://every.to/playtesting/notes-from-the-foothills-of-the-singularity</link>
    </item>
    <item>
      <title>After Automation</title>
      <description>&lt;table&gt;&lt;tr&gt;&lt;td&gt;&lt;/td&gt;&lt;td&gt;&lt;table&gt;&lt;tr&gt;&lt;td&gt;by &lt;a href="https://every.to/@danshipper" itemprop="name"&gt;Dan Shipper&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;figure&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/post/cover/4265/full_page_cover_ed1eeb6ae5ed77d8-Cover_image_for_manifesto.png"&gt;&lt;figcaption&gt;Midjourney/Every illustration.&lt;/figcaption&gt;&lt;/figure&gt;&lt;p&gt;&lt;em&gt;Was this newsletter forwarded to you? &lt;u&gt;&lt;a href="https://every.to/account" rel="noopener noreferrer" target="_blank"&gt;Sign up&lt;/a&gt;&lt;/u&gt; to get it in your inbox.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;We’ve automated everything we can here at &lt;u&gt;&lt;a href="https://every.to/" rel="noopener noreferrer" target="_blank"&gt;Every&lt;/a&gt;&lt;/u&gt;. Agents write our code, draft our emails, handle customer support, and help compile the newsletter. We alpha-test new models before they launch. We use AI in every way imaginable to build and ship everything we touch. We go as far and as fast as possible.&lt;/p&gt;&lt;p&gt;Yet there’s more human work to do than ever.&lt;/p&gt;&lt;p&gt;Today we’re publishing &lt;a href="https://every.to/p/after-automation" rel="noopener noreferrer" target="_blank"&gt;“After Automation.”&lt;/a&gt; It’s something I’ve been working through for a while. The popular narrative is that AI will eliminate human work. But I think technological progress creates more for people to do, not less. And that’s a good thing.&lt;/p&gt;&lt;p&gt;This report traces what happens when cheap competence floods in and creates sameness, and how no matter how good AI gets at executing complex tasks, there will always be a new frame for humans to hand it. 
I’ve included examples from inside Every: how we embed our agents, what benchmarks we use, prompt engineering we play with, and what the work looks like when humans stay structurally ahead of the models.&lt;/p&gt;&lt;p&gt;Of course, this report is agent-native. Drop it into Codex or Claude and argue with it to your heart’s content.&lt;/p&gt;&lt;div class="quill-button" data-source="{&amp;quot;id&amp;quot;:&amp;quot;quill-button-1779381801526&amp;quot;,&amp;quot;text&amp;quot;:&amp;quot;Read \&amp;quot;After Automation\&amp;quot;&amp;quot;,&amp;quot;url&amp;quot;:&amp;quot;https://every.to/p/after-automation?source=post_button&amp;quot;}" id="quill-button-1779381801526"&gt;&lt;a href="https://every.to/p/after-automation?source=post_button"&gt;Read "After Automation"&lt;/a&gt;&lt;/div&gt;&lt;div class="quill-button" data-source="{&amp;quot;id&amp;quot;:&amp;quot;quill-button-1779381821373&amp;quot;,&amp;quot;text&amp;quot;:&amp;quot;Watch the video&amp;quot;,&amp;quot;url&amp;quot;:&amp;quot;https://www.youtube.com/watch?v=jBZQ5Ay20HU&amp;amp;feature=youtu.be&amp;amp;source=post_button&amp;quot;}" id="quill-button-1779381821373"&gt;&lt;a href="https://www.youtube.com/watch?v=jBZQ5Ay20HU&amp;amp;feature=youtu.be&amp;amp;source=post_button"&gt;Watch the video&lt;/a&gt;&lt;/div&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;strong&gt;&lt;em&gt;&lt;a href="https://every.to/@danshipper" rel="noopener noreferrer" target="_blank"&gt;Dan Shipper&lt;/a&gt;&lt;/em&gt;&lt;/strong&gt; &lt;em&gt;is the cofounder and CEO of Every, where he writes the&lt;/em&gt; &lt;em&gt;&lt;a href="https://every.to/chain-of-thought" rel="noopener noreferrer" target="_blank"&gt;Chain of Thought&lt;/a&gt;&lt;/em&gt; &lt;em&gt;column and hosts the podcast&lt;/em&gt; &lt;a href="https://open.spotify.com/show/5qX1nRTaFsfWdmdj5JWO1G" rel="noopener noreferrer" target="_blank"&gt;AI &amp;amp; I&lt;/a&gt;. 
&lt;em&gt;You can follow him on X at&lt;/em&gt; &lt;em&gt;&lt;a href="https://twitter.com/danshipper" rel="noopener noreferrer" target="_blank"&gt;@danshipper&lt;/a&gt;&lt;/em&gt; &lt;em&gt;and on&lt;/em&gt; &lt;em&gt;&lt;a href="https://www.linkedin.com/in/danshipper/" rel="noopener noreferrer" target="_blank"&gt;LinkedIn&lt;/a&gt;. &lt;/em&gt;</description>
      <author>Dan Shipper</author>
      <pubDate>2026-05-21 10:00:00 -0400</pubDate>
      <guid>https://every.to/p/after-automation</guid>
      <link>https://every.to/p/after-automation</link>
    </item>
    <item>
      <title>Google I/O: Agents, Agents, Agents</title>
      <description>&lt;table&gt;&lt;tr&gt;&lt;td&gt;&lt;img alt="Context Window" src="https://d24ovhgu8s7341.cloudfront.net/uploads/publication/logo/94/small_context_windown_1.png" /&gt;&lt;/td&gt;&lt;td&gt;&lt;/td&gt;&lt;td&gt;&lt;table&gt;&lt;tr&gt;&lt;td&gt;by &lt;a href="https://every.to/@jackcheng" itemprop="name"&gt;Jack Cheng&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;in &lt;a href="https://every.to/context-window"&gt;Context Window&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;figure&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/post/cover/4269/full_page_cover_74e636b1fd85c943-Cover_image_for_today__1_.png"&gt;&lt;figcaption&gt;Midjourney/Every illustration.&lt;/figcaption&gt;&lt;/figure&gt;&lt;p&gt;Google I/O dominated the week, and the message from Mountain View was unsubtle: Agents are now the product, with Gemini 3.5 Flash powering a redesigned search and a new fleet of always-on assistants. One layer down, Anthropic paid a reported $300 million for Stainless—so we’re re-upping our &lt;em&gt;&lt;a href="https://every.to/podcast" rel="noopener noreferrer" target="_blank"&gt;AI &amp;amp; I&lt;/a&gt;&lt;/em&gt; episode with CEO &lt;strong&gt;Alex Rattray&lt;/strong&gt;, who laid out the design principles for making software legible to agents months before the deal happened. Plus: We did a mini-&lt;a href="https://every.to/vibe-check" rel="noopener noreferrer" target="_blank"&gt;Vibe Check&lt;/a&gt; of Figma’s new in-canvas agent to see whether it solves the blank-page problem.—&lt;em&gt;&lt;a href="https://every.to/@kate_1767" rel="noopener noreferrer" target="_blank"&gt;Kate Lee&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;Was this newsletter forwarded to you? 
&lt;u&gt;&lt;a href="https://every.to/account" rel="noopener noreferrer" target="_blank"&gt;Sign up&lt;/a&gt;&lt;/u&gt; to get it in your inbox.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;h3&gt;Spotlight&lt;/h3&gt;&lt;h4&gt;Alex Rattray, Stainless CEO and MCP whisperer&lt;/h4&gt;&lt;p&gt;Flashy frontier &lt;u&gt;&lt;a href="https://every.to/vibe-check/gpt-5-5" rel="noopener noreferrer" target="_blank"&gt;model releases&lt;/a&gt;&lt;/u&gt; suck up most of the oxygen in the AI ecosystem. But without reliable ways for AI agents to access these models, their capabilities are limited. This plumbing may be easy to overlook, but it’s an indispensable component of an agent-native internet. &lt;/p&gt;&lt;p&gt;You don’t have to take our word for it. On Monday, Anthropic announced it has &lt;u&gt;&lt;a href="https://www.anthropic.com/news/anthropic-acquires-stainless" rel="noopener noreferrer" target="_blank"&gt;acquired Stainless&lt;/a&gt;&lt;/u&gt;, a software platform for high-quality APIs, to extend Claude’s ability to connect to data and tools. (While terms weren’t disclosed, The Information put the purchase price at north of &lt;u&gt;&lt;a href="https://www.theinformation.com/articles/anthropic-talks-buy-developer-tools-startup-used-openai-google?rc=ekymys" rel="noopener noreferrer" target="_blank"&gt;$300 million&lt;/a&gt;&lt;/u&gt;.) 
Former Stainless customers include OpenAI and Google, meaning Anthropic has acquired a developer tooling company used by its top rivals.&lt;/p&gt;&lt;p&gt;In October, Stainless CEO and founder &lt;strong&gt;&lt;u&gt;&lt;a href="https://www.linkedin.com/in/alexrattray/" rel="noopener noreferrer" target="_blank"&gt;Alex Rattray&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; joined &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@danshipper" rel="noopener noreferrer" target="_blank"&gt;Dan Shipper&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; on &lt;em&gt;&lt;a href="https://every.to/podcast" rel="noopener noreferrer" target="_blank"&gt;AI &amp;amp; I&lt;/a&gt; &lt;/em&gt;to talk about why teaching models to use software is so tricky, and what &lt;u&gt;&lt;a href="https://every.to/podcast/he-s-building-the-plumbing-for-ai-to-use-the-internet" rel="noopener noreferrer" target="_blank"&gt;design principles&lt;/a&gt;&lt;/u&gt; make model context protocol (MCP) servers more intuitive for LLMs. (TL;DR: Keep the number of tools an agent can access small, give the tools precise names, and aim to generate tightly defined outputs.) In the episode, Alex goes deep on Stainless’s approach to making it easier for AI agents to use the internet—hard-won insights that, as it turns out, can lead to a big-sticker acquisition from a top model company. 
[Disclosure: Dan is a small investor in Stainless.]&lt;/p&gt;&lt;p&gt;Read Anthropic’s &lt;u&gt;&lt;a href="https://www.anthropic.com/news/anthropic-acquires-stainless" rel="noopener noreferrer" target="_blank"&gt;announcement&lt;/a&gt;&lt;/u&gt; about its decision to buy Stainless and then watch Rattray’s &lt;em&gt;AI &amp;amp; I&lt;/em&gt; episode &lt;a href="https://x.com/danshipper/status/2057122805657821240" rel="noopener noreferrer" target="_blank"&gt;on X&lt;/a&gt; or &lt;u&gt;&lt;a href="https://youtu.be/diXNk8ibJVk" rel="noopener noreferrer" target="_blank"&gt;YouTube&lt;/a&gt;&lt;/u&gt;, or listen on &lt;u&gt;&lt;a href="https://open.spotify.com/episode/2xKWTcJkEzJLPxChgXmHvg?si=XXbLCfDURE6AJmJh60b86g" rel="noopener noreferrer" target="_blank"&gt;Spotify&lt;/a&gt;&lt;/u&gt; or &lt;u&gt;&lt;a href="https://podcasts.apple.com/us/podcast/inside-stainless-the-developer-tools-startup/id1719789201?i=1000768755708" rel="noopener noreferrer" target="_blank"&gt;Apple Podcasts&lt;/a&gt;&lt;/u&gt; (or read the episode &lt;u&gt;&lt;a href="https://every.to/podcast/inside-stainless-the-developer-tools-startup-anthropic-just-bought-for-300-million" rel="noopener noreferrer" target="_blank"&gt;transcript&lt;/a&gt;&lt;/u&gt;).—&lt;em&gt;&lt;u&gt;&lt;a href="https://every.to/@laura_27bbaf_1" rel="noopener noreferrer" target="_blank"&gt;Laura Entis&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/p&gt;&lt;div class="quill-youtube" id="undefined" data-source="{&amp;quot;url&amp;quot;:&amp;quot;https://youtu.be/diXNk8ibJVk&amp;quot;,&amp;quot;height&amp;quot;:&amp;quot;400&amp;quot;,&amp;quot;youtube_id&amp;quot;:&amp;quot;diXNk8ibJVk&amp;quot;}" data-height="400" data-youtube-id="diXNk8ibJVk" style="max-height: 400px; overflow: hidden;"&gt;&lt;a href="https://youtu.be/diXNk8ibJVk" target="_blank" rel="noopener noreferrer"&gt;&lt;img src="https://img.youtube.com/vi/diXNk8ibJVk/maxresdefault.jpg" style="width: 100%; aspect-ratio: 16 / 9; display: block;"&gt;&lt;div class="play"&gt;&lt;img 
src="https://d24ovhgu8s7341.cloudfront.net/static/emails/youtube-logo.png"&gt;&lt;/div&gt;&lt;/a&gt;&lt;/div&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;Signal&lt;/h2&gt;&lt;h4&gt;Google goes all-in on agents&lt;/h4&gt;&lt;p&gt;We’re hurtling toward an AI landscape divided into &lt;u&gt;&lt;a href="https://every.to/context-window/the-dawn-of-codex-native-apps" rel="noopener noreferrer" target="_blank"&gt;two categories&lt;/a&gt;&lt;/u&gt; of agents: those you collaborate with, and those you delegate to. Google’s new releases from its flagship I/O developer conference, happening this week in Mountain View, break neatly along that line. &lt;/p&gt;&lt;p&gt;The headline announcement is Gemini 3.5 Flash, Google’s just-announced frontier model, which it says operates four times faster and at half the cost of comparable LLMs. It’s the engine powering most of the agentic features below.&lt;/p&gt;&lt;h5&gt;&lt;strong&gt;In the ‘collaborate with’ bucket&lt;/strong&gt;&lt;/h5&gt;&lt;p&gt;&lt;strong&gt;AI Mode and the new search box: &lt;/strong&gt;Google is giving search its biggest interface change in 25 years. The search box is expanding to accommodate longer, more conversational questions from users, and AI Mode, which Google introduced at &lt;u&gt;&lt;a href="https://every.to/context-window/google-s-ai-vision-make-tech-human-again" rel="noopener noreferrer" target="_blank"&gt;last year’s I/O conference&lt;/a&gt;&lt;/u&gt;, is becoming the default search mode. With the 2026 updates, you can now build custom mini-apps, such as a personalized fitness tracker, or interactive visualizations directly within search itself. &lt;/p&gt;&lt;p&gt;&lt;strong&gt;Antigravity 2.0&lt;/strong&gt;: Google’s agentic development platform is becoming a desktop app for managing teams of agents, with a new command line interface tool and an SDK for custom workflows. 
You orchestrate, and the agents code, design, or do whatever else you want them to accomplish. &lt;/p&gt;&lt;h5&gt;&lt;strong&gt;In the ‘delegate to’ bucket&lt;/strong&gt;&lt;/h5&gt;&lt;p&gt;&lt;strong&gt;Gemini Spark&lt;/strong&gt;: Google is pitching Spark as a 24/7 personal agent that lives in the cloud, works when your devices are off, and can operate across Gmail, Docs, Workspace, Chrome, and eventually, third-party tools through MCP.&lt;strong&gt; &lt;/strong&gt;“You can just throw tasks over your shoulder,” &lt;strong&gt;Josh Woodward&lt;/strong&gt;, vice president of Google Labs, Gemini, and AI Studio said in the keynote. “Spark will catch them and then run with them.”&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Daily Brief&lt;/strong&gt;: An out-of-the-box agent in the updated Gemini app that works overnight, scanning your inbox, calendar, and tasks so it can hand you a prioritized digest when you wake up in the morning. &lt;/p&gt;&lt;p&gt;&lt;strong&gt;Universal Cart:&lt;/strong&gt; Google’s new shopping cart works across merchants as part of the Universal Commerce Protocol, which it co-developed with Amazon, Meta, Microsoft, and others. Whenever you add something in your cart, it automatically monitors the internet for information on the product, including price drops, price history, and whether something is back in stock. It also analyzes the full contents of your cart to proactively flag potential issues, like if you’re building a PC and the processor and motherboard you’ve selected are incompatible.  &lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;Inside Google I/O&lt;/h2&gt;&lt;h4&gt;Anyone can cook&lt;/h4&gt;&lt;p&gt;Gemini 3.5 Flash, announced in Tuesday’s opening keynote, seems like a meaningful step toward a fast and cheap model that can reliably handle the personal, everyday tasks that most people are looking for help with.&lt;/p&gt;&lt;p&gt;When is a model good &lt;em&gt;enough&lt;/em&gt;? 
That was the question I asked myself heading back to my hotel after the first day of Google I/O. I often send agents on multi-hour coding missions, and need to pull together data from multiple accounts and channels to coordinate my workday. In these cases, each new model release seems to work better than the last. So I eagerly hop from one to another.&lt;/p&gt;&lt;p&gt;On the other hand, for simple, personal tasks like household briefings, tracking my journaling and meditation habits, and light web development, I am loyal to &lt;u&gt;&lt;a href="https://every.to/vibe-check/vibe-check-anthropic-just-made-opus-cheaper-without-calling-it-that" rel="noopener noreferrer" target="_blank"&gt;Sonnet 4.6&lt;/a&gt;&lt;/u&gt;—although sometimes I have to tell it to ask Opus or &lt;u&gt;&lt;a href="https://every.to/vibe-check/gpt-5-5" rel="noopener noreferrer" target="_blank"&gt;GPT-5.5&lt;/a&gt;&lt;/u&gt; for help.&lt;/p&gt;&lt;p&gt;But once a model like Sonnet grew smart enough to handle anything personal I might throw at it, I wondered, what else might I want from it?&lt;/p&gt;&lt;p&gt;I’d want it to be blazingly fast, so that I wasn’t waiting for responses when I was working with it in real-time. I’d also want it to cost next to nothing.&lt;/p&gt;&lt;p&gt;Gemini 3.5 Flash may offer exactly that.&lt;/p&gt;&lt;div class="quill-block-image" id="quill-block-image-1779299084123" data-source="{&amp;quot;dom_id&amp;quot;:&amp;quot;quill-block-image-1779299084123&amp;quot;,&amp;quot;link&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4269/optimized_9e2801e9-d98c-4cc6-89a9-3c206c37d4e3.png&amp;quot;,&amp;quot;image&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4269/optimized_9e2801e9-d98c-4cc6-89a9-3c206c37d4e3.png&amp;quot;,&amp;quot;caption&amp;quot;:&amp;quot;Gemini 3.5 Flash is in a quadrant of its own. 
(Photo courtesy of Jack Cheng.)&amp;quot;,&amp;quot;error&amp;quot;:null}"&gt;&lt;div&gt;&lt;a href="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4269/optimized_9e2801e9-d98c-4cc6-89a9-3c206c37d4e3.png" target="_blank" rel="noopener noreferrer"&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4269/optimized_9e2801e9-d98c-4cc6-89a9-3c206c37d4e3.png" alt="Gemini 3.5 Flash is in a quadrant of its own. (Photo courtesy of Jack Cheng.)"&gt;&lt;/a&gt;&lt;figcaption class="quill-image-caption"&gt;Gemini 3.5 Flash is in a quadrant of its own. (Photo courtesy of Jack Cheng.)&lt;/figcaption&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;&lt;br&gt;&lt;/p&gt;&lt;p&gt;If the benchmarks are to be believed, then Gemini 3.5 Flash delivers &lt;u&gt;&lt;a href="https://every.to/vibe-check/opus-4-7" rel="noopener noreferrer" target="_blank"&gt;Opus 4.7&lt;/a&gt;&lt;/u&gt;-level intelligence at four times the speed. Accurate, near-instantaneous responses let Google believably send users from search results pages into its “AI Mode” without them realizing that they’ve entered a new state. A chat interface, after all, is not that far off from a search box. But for that chat interface to still feel like Google search, it has to be just as snappy as traditional search.&lt;/p&gt;&lt;p&gt;It remains to be seen how users will take to the deeper AI Mode integration once the update rolls out globally, as it’s beginning to do this week. But Google says 2.5 billion people already use the “AI Overviews” at the top of results pages, and these summaries will now let you ask follow-up questions. Every search becomes the start of a conversation with an AI agent that can generate text and images, spin up research agents, code up interactive widgets and mini-apps, and more. &lt;/p&gt;&lt;p&gt;This could lead many more people to experience their first “aha moment” with AI. Google’s core competencies around speed and scale really come through in the Gemini 3.5 Flash release. 
&lt;/p&gt;&lt;p&gt;The context it already has on users through their Gmail, Google Calendar, and Google Docs accounts removes one of the main headaches in setting up AI agents. Google is perhaps one of two companies in the world—along with Apple (which will &lt;em&gt;also&lt;/em&gt; be using Gemini to power its own coming AI integration)—with moats of this size. Pretty soon, billions of people could be using agentic AI for the first time to cook up tools and workflows that make their lives easier or more enjoyable in some small way.&lt;/p&gt;&lt;p&gt;Oddly enough, Google’s announcements at I/O so far don’t affect those of us riding the edge of the AI wave. Reception to the day’s announcements in Every’s Slack was tepid. But I don’t think Google’s keynotes were speaking to people tinkering with &lt;u&gt;&lt;a href="https://every.to/guides/claw-school" rel="noopener noreferrer" target="_blank"&gt;OpenClaw&lt;/a&gt;&lt;/u&gt; or using and building &lt;u&gt;&lt;a href="https://every.to/context-window/the-dawn-of-codex-native-apps" rel="noopener noreferrer" target="_blank"&gt;Codex-native apps&lt;/a&gt;&lt;/u&gt; to do their email and learn piano.&lt;/p&gt;&lt;p&gt;To me, the significance of Gemini 3.5 Flash and Google’s AI search announcement, amid a sea of other announcements, was underscored by one of the last slides of one of the last developer sessions of the first day. 
It read:&lt;/p&gt;&lt;p&gt;“We are the first generation of builders creating tools for a world where anyone can build anything.”—&lt;em&gt;&lt;u&gt;&lt;a href="https://every.to/@jackcheng" rel="noopener noreferrer" target="_blank"&gt;Jack Cheng&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;Log on&lt;/h2&gt;&lt;p&gt;We host &lt;u&gt;&lt;a href="https://every.to/events" rel="noopener noreferrer" target="_blank"&gt;camps and workshops&lt;/a&gt;&lt;/u&gt; on topics like &lt;u&gt;&lt;a href="http://youtube.com/watch?v=7YUBxMTF1Tc&amp;amp;time_continue=3&amp;amp;source_ve_path=NzY3NTg&amp;amp;embeds_referring_euri=https%3A%2F%2Fevery.to%2F" rel="noopener noreferrer" target="_blank"&gt;compound engineering&lt;/a&gt;&lt;/u&gt; and &lt;u&gt;&lt;a href="https://www.youtube.com/watch?v=oEvjbPwGwnc" rel="noopener noreferrer" target="_blank"&gt;writing with AI&lt;/a&gt;&lt;/u&gt; to share what we’ve learned from training teams at companies like the &lt;em&gt;&lt;u&gt;&lt;a href="https://every.to/on-every/the-next-chapter-of-every-consulting" rel="noopener noreferrer" target="_blank"&gt;New York Times&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;u&gt;&lt;a href="https://every.to/on-every/the-next-chapter-of-every-consulting" rel="noopener noreferrer" target="_blank"&gt; and leading hedge funds&lt;/a&gt;&lt;/u&gt;, and by using and experimenting with AI every day ourselves.&lt;/p&gt;&lt;h5&gt;Upcoming event&lt;/h5&gt;&lt;ul&gt;&lt;li&gt;&lt;u&gt;&lt;a href="https://every.to/events/executive-AI-sessions" rel="noopener noreferrer" target="_blank"&gt;Executive AI Sessions&lt;/a&gt;&lt;/u&gt;: On June 2, head of consulting &lt;u&gt;&lt;a href="https://every.to/@natalia_2944" rel="noopener noreferrer" target="_blank"&gt;Natalia Quintero&lt;/a&gt;&lt;/u&gt; hosts a live webinar introducing &lt;u&gt;&lt;a href="https://every.to/consulting" rel="noopener noreferrer" target="_blank"&gt;Every Consulting&lt;/a&gt;&lt;/u&gt;’s new 
offering for leadership teams navigating AI adoption—built on the playbook we’ve been running with executive clients for months. &lt;u&gt;&lt;a href="https://every.to/events/executive-AI-sessions" rel="noopener noreferrer" target="_blank"&gt;Learn more and register&lt;/a&gt;&lt;/u&gt;.&lt;/li&gt;&lt;/ul&gt;&lt;h5&gt;In New York City&lt;/h5&gt;&lt;ul&gt;&lt;li&gt;&lt;u&gt;&lt;a href="https://luma.com/2o67t7ob" rel="noopener noreferrer" target="_blank"&gt;Every 🤝 IRL&lt;/a&gt;&lt;/u&gt;: Join us at the Every brownstone in Brooklyn on June 3 during New York Tech Week for a subscriber-only meetup celebrating the Every community over drinks and conversation. &lt;u&gt;&lt;a href="https://luma.com/2o67t7ob" rel="noopener noreferrer" target="_blank"&gt;Learn more and RSVP&lt;/a&gt;&lt;/u&gt;.&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;Mini-Vibe Check: Figma agent&lt;/h2&gt;&lt;h3&gt;Figma makes the blank canvas less blank&lt;/h3&gt;&lt;p&gt;In March 2026, &lt;strong&gt;Figma&lt;/strong&gt; &lt;u&gt;&lt;a href="https://www.figma.com/blog/the-figma-canvas-is-now-open-to-agents/" rel="noopener noreferrer" target="_blank"&gt;opened its canvas&lt;/a&gt;&lt;/u&gt; to outside AI agents. The update let coding tools like &lt;u&gt;&lt;a href="https://every.to/vibe-check/vibe-check-claude-cowork-is-claude-code-for-the-rest-of-us" rel="noopener noreferrer" target="_blank"&gt;Claude Code&lt;/a&gt;&lt;/u&gt;, &lt;u&gt;&lt;a href="https://every.to/vibe-check/cursor" rel="noopener noreferrer" target="_blank"&gt;Cursor&lt;/a&gt;&lt;/u&gt;, and &lt;u&gt;&lt;a href="https://every.to/vibe-check/vibe-check-codex-openai-s-new-coding-agent" rel="noopener noreferrer" target="_blank"&gt;Codex&lt;/a&gt;&lt;/u&gt; connect to Figma through MCP (model context protocol, the open standard that lets AI agents talk to external software) and write designs directly into a Figma file. 
&lt;/p&gt;&lt;p&gt;Today, Figma releases &lt;u&gt;&lt;a href="https://www.figma.com/blog/the-figma-agent-is-here/" rel="noopener noreferrer" target="_blank"&gt;its own agent&lt;/a&gt;&lt;/u&gt; that lives inside Figma. It edits your canvas directly—switching component states (the variants of a design element, like when a button looks one way when hovered and another when clicked), restyling layouts, and generating new screens. It’s built on a mix of Google’s &lt;u&gt;&lt;a href="https://every.to/vibe-check/vibe-check-gemini-2-5-pro-and-gemini-2-5-flash" rel="noopener noreferrer" target="_blank"&gt;Gemini Flash&lt;/a&gt;&lt;/u&gt;, Anthropic’s &lt;u&gt;&lt;a href="https://every.to/vibe-check/vibe-check-claude-sonnet-4-5" rel="noopener noreferrer" target="_blank"&gt;Claude Sonnet&lt;/a&gt;&lt;/u&gt;, and Figma’s own fine-tuned models. Figma users no longer have to leave their canvas, or hand the work off to an engineer, to get an AI-generated first draft.&lt;/p&gt;&lt;p&gt;Every got access a day before the announcement. Head of marketing &lt;strong&gt;Douglas Brundage&lt;/strong&gt;, senior designer&lt;strong&gt; &lt;u&gt;&lt;a href="https://every.to/@daniel_5fbd21_1" rel="noopener noreferrer" target="_blank"&gt;Daniel Rodrigues&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;, and creative designer &lt;strong&gt;Benjamin Ose&lt;/strong&gt; spent a day testing it. Here’s what they found. &lt;/p&gt;&lt;h3&gt;What works&lt;/h3&gt;&lt;p&gt;When the prompt is specific, the agent produces solid early explorations, preserves copy well, and gives designers something to work with instead of a blank canvas. 
&lt;/p&gt;&lt;p&gt;As Daniel put it, “There’s really no excuse to start from scratch anymore.” &lt;/p&gt;&lt;div class="quill-block-image" id="quill-block-image-1779299189666" data-source="{&amp;quot;dom_id&amp;quot;:&amp;quot;quill-block-image-1779299189666&amp;quot;,&amp;quot;link&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4269/optimized_6b997d11-7467-446b-b47e-e4f203ba324d.png&amp;quot;,&amp;quot;image&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4269/optimized_6b997d11-7467-446b-b47e-e4f203ba324d.png&amp;quot;,&amp;quot;caption&amp;quot;:&amp;quot;The agent can explore visual directions quickly, though fidelity and rendering still need designer review. (Image courtesy of Douglas Brundage.)&amp;quot;,&amp;quot;error&amp;quot;:null}"&gt;&lt;div&gt;&lt;a href="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4269/optimized_6b997d11-7467-446b-b47e-e4f203ba324d.png" target="_blank" rel="noopener noreferrer"&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4269/optimized_6b997d11-7467-446b-b47e-e4f203ba324d.png" alt="The agent can explore visual directions quickly, though fidelity and rendering still need designer review. (Image courtesy of Douglas Brundage.)"&gt;&lt;/a&gt;&lt;figcaption class="quill-image-caption"&gt;The agent can explore visual directions quickly, though fidelity and rendering still need designer review. (Image courtesy of Douglas Brundage.)&lt;/figcaption&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;&lt;br&gt;&lt;/p&gt;&lt;p&gt;It’s also good for quickly sketching out product ideas. Benjamin used it to mock up a SaaS dashboard for mining X mentions for testimonials and came away with viable early explorations. Here was his initial prompt: &lt;/p&gt;&lt;blockquote&gt;Design a SaaS dashboard that listens for your X handle mentions, uses AI to extract testimonials (positive shouts, reviews, endorsements), and stores them in a searchable vault. 
One-click export to websites as embeds, widgets, or APIs—think Grammarly’s clean proofing flow meets Stripe’s embeddable elements. Freemium entry: Basic capture free, premium for AI curation and analytics.&lt;/blockquote&gt;&lt;div class="quill-block-image" id="quill-block-image-1779299231296" data-source="{&amp;quot;dom_id&amp;quot;:&amp;quot;quill-block-image-1779299231296&amp;quot;,&amp;quot;link&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4269/optimized_71af90ec-f87e-4333-961c-b33e9aa3c1af.png&amp;quot;,&amp;quot;image&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4269/optimized_71af90ec-f87e-4333-961c-b33e9aa3c1af.png&amp;quot;,&amp;quot;caption&amp;quot;:&amp;quot;Benjamin used the agent to come up with a testimonial-mining SaaS dashboard, producing a structured early exploration ready for cleanup and iteration. (Image courtesy of Benjamin Ose.)&amp;quot;,&amp;quot;error&amp;quot;:null}"&gt;&lt;div&gt;&lt;a href="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4269/optimized_71af90ec-f87e-4333-961c-b33e9aa3c1af.png" target="_blank" rel="noopener noreferrer"&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4269/optimized_71af90ec-f87e-4333-961c-b33e9aa3c1af.png" alt="Benjamin used the agent to come up with a testimonial-mining SaaS dashboard, producing a structured early exploration ready for cleanup and iteration. (Image courtesy of Benjamin Ose.)"&gt;&lt;/a&gt;&lt;figcaption class="quill-image-caption"&gt;Benjamin used the agent to come up with a testimonial-mining SaaS dashboard, producing a structured early exploration ready for cleanup and iteration. (Image courtesy of Benjamin Ose.)&lt;/figcaption&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;&lt;br&gt;&lt;/p&gt;&lt;h3&gt;What needs work&lt;/h3&gt;&lt;p&gt;The agent is less useful for detailed work. 
Tabs rendered improperly, buttons doubled up, components drifted out of alignment, and some outputs came back weirdly low-res. It can lay down the structure, but the designer still has to go in and fix the details. There’s no ability to attach an image or a link as a visual reference for the agent. Right now the agent relies on a prompt-writing skill or an existing Figma frame. &lt;/p&gt;&lt;p&gt;Benjamin also said the agent would be much more useful if it worked from an existing design system, instead of inventing from scratch—pulling in the components, colors, spacing, and styles a team already uses in Figma. Ideally, it could also draw on the reference tools designers use, like &lt;u&gt;&lt;a href="https://mobbin.com/" rel="noopener noreferrer" target="_blank"&gt;Mobbin&lt;/a&gt;&lt;/u&gt;.&lt;/p&gt;&lt;h3&gt;Our verdict&lt;/h3&gt;&lt;p&gt;Figma’s agent isn’t a fully trustworthy design copilot yet, but it solves the blank-page problem for early design work. Its job is to get designers from zero to first pass, so their energy can shift to judgment and polish.&lt;/p&gt;&lt;p&gt;It delivers on that promise for exploration, layout starts, and iteration. It still needs better fidelity, stronger detail handling, and richer reference inputs before it can feel dependable in production.—&lt;em&gt;&lt;u&gt;&lt;a href="https://every.to/@katie.parrott12" rel="noopener noreferrer" target="_blank"&gt;Katie Parrott&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://every.to/@jackcheng" rel="noopener noreferrer" target="_blank"&gt;Jack Cheng&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt; is a senior editor at Every. He is a creative generalist and the author of two novels for young readers. 
You can follow him on &lt;a href="https://x.com/jackcheng" rel="noopener noreferrer" target="_blank"&gt;X&lt;/a&gt; or read his occasional &lt;u&gt;&lt;a href="https://jackcheng.com/sunday" rel="noopener noreferrer" target="_blank"&gt;Sunday&lt;/a&gt; newsletter&lt;/u&gt;.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;To read more essays like this, subscribe to &lt;u&gt;&lt;a href="https://every.to/subscribe" rel="noopener noreferrer" target="_blank"&gt;Every&lt;/a&gt;&lt;/u&gt;, and follow us on X at &lt;u&gt;&lt;a href="http://twitter.com/every" rel="noopener noreferrer" target="_blank"&gt;@every&lt;/a&gt;&lt;/u&gt; and on &lt;u&gt;&lt;a href="https://www.linkedin.com/company/everyinc/" rel="noopener noreferrer" target="_blank"&gt;LinkedIn&lt;/a&gt;&lt;/u&gt;.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;We &lt;u&gt;&lt;a href="https://every.to/studio" rel="noopener noreferrer" target="_blank"&gt;build AI tools&lt;/a&gt;&lt;/u&gt; for readers like you. Write brilliantly with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://writewithspiral.com/" rel="noopener noreferrer" target="_blank"&gt;Spiral&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. Organize files automatically with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://makeitsparkle.co/?utm_source=everyfooter" rel="noopener noreferrer" target="_blank"&gt;Sparkle&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. Deliver yourself from email with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://cora.computer/" rel="noopener noreferrer" target="_blank"&gt;Cora&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. Dictate effortlessly with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://monologue.to/" rel="noopener noreferrer" target="_blank"&gt;Monologue&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. 
Collaborate with agents on documents with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;a href="https://www.proofeditor.ai/" rel="noopener noreferrer" target="_blank"&gt;Proof&lt;/a&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. &lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;For sponsorship opportunities, reach out to sponsorships@every.to.&lt;/em&gt;&lt;/p&gt;&lt;div class="quill-button" data-source="{&amp;quot;id&amp;quot;:&amp;quot;quill-button-1769187301610&amp;quot;,&amp;quot;url&amp;quot;:&amp;quot;https://every.to/subscribe?source=post_button&amp;quot;,&amp;quot;text&amp;quot;:&amp;quot;Subscribe&amp;quot;}" id="quill-button-1769187301610"&gt;&lt;a href="https://every.to/subscribe?source=post_button"&gt;Subscribe&lt;/a&gt;&lt;/div&gt;</description>
      <author>Jack Cheng / Context Window</author>
      <pubDate>2026-05-20 13:00:00 -0400</pubDate>
      <guid>https://every.to/context-window/google-i-o-agents-agents-agents</guid>
      <link>https://every.to/context-window/google-i-o-agents-agents-agents</link>
    </item>
    <item>
      <title>Inside Stainless, The Developer Tools Startup Anthropic Just Bought for $300 Million</title>
      <description>&lt;table&gt;&lt;tr&gt;&lt;td&gt;&lt;img alt="AI &amp;amp; I" src="https://d24ovhgu8s7341.cloudfront.net/uploads/publication/logo/97/small_ai_and_i_cover_1.png" /&gt;&lt;/td&gt;&lt;td&gt;&lt;/td&gt;&lt;td&gt;&lt;table&gt;&lt;tr&gt;&lt;td&gt;by &lt;a href="https://every.to/@danshipper" itemprop="name"&gt;Dan Shipper&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;in &lt;a href="https://every.to/podcast"&gt;AI &amp;amp; I&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;p&gt;&lt;strong&gt;The transcript of &lt;em&gt;&lt;u&gt;&lt;a href="https://every.to/podcast" rel="noopener noreferrer" target="_blank"&gt;AI &amp;amp; I&lt;/a&gt;&lt;/u&gt;&lt;/em&gt; with Stainless CEO Alex Rattray is below. Watch on X or YouTube, or listen on Spotify or Apple Podcasts. [Disclosure: I’m a small investor in Stainless.]&lt;/strong&gt;&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;Timestamps&lt;/strong&gt;&lt;/h3&gt;&lt;ol&gt;&lt;li&gt;Introduction: 00:01:15&lt;/li&gt;&lt;li&gt;APIs and MCP, the connectors of the new internet: 00:05:09&lt;/li&gt;&lt;li&gt;Why MCP exists: 00:11:00&lt;/li&gt;&lt;li&gt;Why MCP servers are hard to get right: 00:17:15&lt;/li&gt;&lt;li&gt;Design principles for reliable MCP servers: 00:20:24&lt;/li&gt;&lt;li&gt;Using MCP for business ops at Stainless: 00:25:06&lt;/li&gt;&lt;li&gt;Alex’s take on the security model for MCP: 00:40:57&lt;/li&gt;&lt;li&gt;How one-off AI actions become permanent production software: 00:44:42&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;&lt;strong&gt;Dan Shipper&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;The internet runs on computers talking to each other, but its entire architecture was built for a pre-AI world. Now we’re trying to hook AI up to the internet with MCP—Model Context Protocol—which turns any website or web service into a set of tools that an AI can use natively to get work done. 
And the software companies that learn how to do MCP well are going to win over the next decade.&lt;/p&gt;&lt;p&gt;That’s why I brought Alex Rattray, the founder and CEO of Stainless, onto the show. Stainless’s job is to help computers talk to each other. They make the APIs and SDKs for all the big companies you know about, like OpenAI and Anthropic, and they’re starting to build MCP servers too. Alex and I get into the nitty-gritty of what the future of MCP looks like, how to design good MCPs, why MCPs are actually really hard to scale and possibly insecure, and we try to figure out together what a better model for allowing AIs to use the internet might look like.&lt;/p&gt;&lt;p&gt;This is a great episode. Alex is a good friend of mine. Let’s dive in.&lt;/p&gt;&lt;p&gt;Alex, welcome to the show.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Alex Rattray&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Thanks, Dan. It’s really exciting to be here.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan Shipper&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;It’s good to have you. For people who don’t know, you are the founder and CEO of Stainless, which is the API company. You make APIs for companies like OpenAI and Anthropic—just name your big company that you might use their API, and Stainless is probably behind it. Before that you worked at Stripe doing their API, which makes total sense. And before that, most importantly, we were very good friends in college and have remained good friends. We were both starting companies in college. I’m a tiny investor in Stainless. It’s been really fun to watch your journey and get to hang out together so much over the years, and I’m just very excited to bring you on to talk about AI and what you’re doing at Stainless.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Alex Rattray&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Thanks, Dan. It’s been really fun over the years. When we were in college, I was working on a startup and you were working on a startup. 
You had a conference room at a venture capitalist office as your office, and you let me crash there with my co-founder and team. We were just on the other side of the conference table hacking away into the evening. Very fond memories of those days. And these days it’s not every evening, but on the weekends, whatever—the same thing is still happening. You don’t see that every day, and it’s a really nice feeling. It’s been great to see everything happening with Every along the way.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan Shipper&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Thank you. As I say, I started from the bottom, now we’re here.&lt;/p&gt;&lt;p&gt;The thing I always say when I run into people and they ask me about you—in order to embarrass you—is that you’re the only person I know of who has consistently run barefoot through the streets of Philadelphia. When we first met, you were not a fan of shoes and you were a fan of running. You want to talk about that?&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Alex Rattray&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;It wasn’t that I didn’t like the concept of shoes—it’s that I couldn’t find a good pair. At a certain point, I was running through Nikes and they would bust open every few months. I think what was actually going on is that I had really wide feet and was probably buying narrow shoes. Shoes would constantly get ruined, and on a college budget it’s just like, “This is no good.” Eventually I decided, okay, the longer you wear your shoes, the more worn out they get, but the longer you just wear your feet, the tougher they get.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan Shipper&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;“The longer you wear your feet.”&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Alex Rattray&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Try it out. Try this at home. What could go wrong? I actually currently have a really annoying splinter in one of my feet—so don’t actually try this at home. 
But—&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan Shipper&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Are you still running barefoot?&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Alex Rattray&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;No, no. This is just from around the house.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan Shipper&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Dangerous.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Alex Rattray&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Yeah. But see, that’s the thing. If I had been going around on the asphalt without socks on, my feet would’ve been tougher and I’d have no splinter.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan Shipper&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;So when you’re not running barefoot, you’re running Stainless. You’re around 50 people now, right?&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Alex Rattray&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Just about, yeah.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan Shipper&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;That’s pretty wild. You started Stainless in a pre-AI world, and now we’re in an AI world, and I think you have some ideas for what the future of AI is going to be and how APIs fit into that, how MCPs fit into that. Do you want to paint a little picture for us about where we’re going?&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Alex Rattray&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;I would love to. To start—what’s an API? Not everybody’s familiar with that. It stands for application programming interface. There will not be a quiz, right, Dan?&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan Shipper&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;No quizzes.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Alex Rattray&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Great. Basically, it’s how one computer program talks to another computer program. It’s how computers talk to computers, how apps talk to apps. APIs are the dendrites of the internet. Dendrites are where your neurons connect and actually exchange information with each other. If you have two neurons in your brain but they’re not talking to each other, you’re actually not thinking. 
There is no thought happening in a brain without connections between neurons.&lt;/p&gt;&lt;p&gt;And if you think about the internet—if all these servers in the cloud weren’t talking to each other, you wouldn’t have internet. Programs, internet software, does nothing without APIs, without connections to other programs. It’s really fundamental to the mesh of pretty much all modern software. Everything we think of when we think about technology—APIs are at the heart and center of that, just like dendrites are the center of the mesh of the brain and how we think.&lt;/p&gt;&lt;p&gt;Stainless’s mission from day one was to make it easier for computers to talk to computers. The long-running trend of technology is toward more automation. APIs are how most business-to-business interactions, in some format or another, become real, become automated.&lt;/p&gt;&lt;p&gt;What we see with the rise of AI is that a new computer has entered the chat. There’s a new kind of system that can talk to other systems—or at least we’d like it to be able to. You used to have either humans interacting with a computer through a user interface, or a computer interacting with a computer through an API. Now we have LLMs interacting with computers. What’s that through?&lt;/p&gt;&lt;p&gt;Anyone familiar with Every and who’s a regular listener will know MCP—Model Context Protocol—which is a system for connecting LLMs to computers broadly speaking. It’s an area we’re investing in at Stainless. It’s really part of our core mission of making it easy for computers to talk to computers.&lt;/p&gt;&lt;p&gt;The core product we first brought to market is software development kits, SDKs. These are ways of saying, “Okay, Stripe has this great REST API. You can send JSON over HTTP and get back JSON over HTTP. 
And if you want that to be really convenient, you’re going to use the Stripe Python library, the Stripe Python SDK.” If you’re a Python developer, you’ll go pip install stripe, and then in your application code you’ll write stripe.customers.create, and all of a sudden you have a nice new customer object in your Stripe database and you’re off to the races. Or stripe.charges.create in the old days, to charge a credit card.&lt;/p&gt;&lt;p&gt;SDKs give developers that easy way to interface with an API. What’s the thing that gives LLMs an easy way to interface with an API? You might say MCP, and in a sense you’d be right. But what we’re seeing so far as MCP rolls out into the world and people experiment with it is that it’s not working so great. It’s difficult to deliver on what I see as the core vision of what’s so exciting about MCP.&lt;/p&gt;&lt;p&gt;A dashboard and a user interface lets you click around, see a bunch of stuff, fill out forms, click buttons, do things—anything you’d do while interacting with software, you do through the UI. But LLMs interacting through MCP tend to be much more restricted. You can only do a few little things. There’s usually not a ton of tools you’re going to be exposing to the models.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;(00:10:00)&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan Shipper&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Just to stop you there—what I’m hearing you say is that just like a website is built for humans to use, MCP is sort of the equivalent for models. You can think of it as exposing a set of tools the model can use to perform certain functions. Just like you might click a button on a website, MCP gives the model a bunch of things it can click on or use to get work done.&lt;/p&gt;&lt;p&gt;An example might be a Gmail MCP that has a send mail tool, a compose mail tool, a read inbox tool—that kind of thing. And instead of a human going on the Gmail website and doing it, the LLM is essentially logging in and using it itself. 
It’s a native interface for language models. But you’re saying that’s not working that well. Can you tell me more?&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Alex Rattray&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Let’s start with what I see as the big vision of MCP and, in some sense, the big vision of agentic AI in the first place. I’ll start with the most pedestrian example you can imagine.&lt;/p&gt;&lt;p&gt;Let’s say Dan walks into my store and buys a pair of stripey socks and maybe a few other things. The next day I hear back from Dan that there’s something wrong. It happens, you know? I turn to someone on my team and say, “Hey, can we refund Dan for those stripey socks he bought yesterday and send him a discount code for next time with a little thank-you note, because we like to take care of our customers?”&lt;/p&gt;&lt;p&gt;This is the most normal thing to do in software—some little task like this. What the member of my team would be doing is opening up their internal admin and looking around. They might go to the Stripe dashboard and look through the list of payments or transactions or orders to find one that has someone named Dan. Which Dan? There might be a bunch. Look through the list of products in the order to see whether there were stripey socks in there. That might be a few clicks. Find the right one, then go to the screen where you can create a refund, create the refund, make sure it’s the right amount, then go and create the discount, then take that discount code and send it over to some other SaaS app to send the mail automatically.&lt;/p&gt;&lt;p&gt;Of course, in a business-to-business context, you might be going into Salesforce and sending a Slack message to an account manager, so on and so forth. 
In the normal course of work, it’s just the most normal thing in the world—having one task involve going through five different apps, each time 15 different clicks and scrolls and loading spinners, just to do one simple thing.&lt;/p&gt;&lt;p&gt;The promise of agentic AI is to take that same prompt and type it into ChatGPT or Claude or whatever, say, “Hey, can you help refund my friend Dan?” and just have the AI go off and do that—go through these five different apps and the 15 different screens and the various button presses to complete the task and then come back and say, “Great, it’s done.”&lt;/p&gt;&lt;p&gt;In order to do that—and there are only so many tool calls you have to make as an AI model to perform that exact linear chain of events, so it’s somewhat tractable—but if you think about this in the general case, you want your agentic AI to be able to do anything that human operator would have done, without having to wait for a bunch of JavaScript to load on a website or anything like that.&lt;/p&gt;&lt;p&gt;That means you need not only the Stripe create refund tool and the Stripe list transactions tool and the Stripe list products and lookup customer and create discount tool—you need not only those tools, but you need everything you can do in the Stripe dashboard, which is basically everything you can do in the Stripe API. And that’s actually a lot. There are hundreds of different endpoints in the Stripe API. The Stripe dashboard is massive. It’s a huge application.&lt;/p&gt;&lt;p&gt;If you were to take that list of tools today and go to an LLM and say, “Hey, here’s our MCP definition for all of this. Here’s a create refund tool, here’s a create transactions tool,” so on and so forth, and tell it all about those tools—all the descriptions, all the different request properties, the response properties, all the documentation—everyone listening already knows: you’ve just burned through your entire context budget. 
That’s hundreds of thousands of tokens just in pretty much translating the Stripe OpenAPI spec directly over to MCP tools. Today’s models not only can’t handle that amount of context, it’s a poor use of context because you have a lot else going on. But it’s also just confusing to the model. It’s too much to hold in your brain at one time.&lt;/p&gt;&lt;p&gt;And that’s just the Stripe part of it. What you’re really trying to do is enable your operators to do anything they would normally do. And that spans many, many different SaaS tools. In the course of one interaction, it might be five. In the next interaction, it might be a different five. If you think about every single SaaS tool your business uses on a daily basis to get work done—ideally you’d want every single one of those tools exposed to your operators in their AI chat, with every single tool available, with every nook and cranny and corner case available, so you can do anything through AI. That’s the vision.&lt;/p&gt;&lt;p&gt;There are a lot of problems with that. The biggest is this context window limit. But you also have all sorts of security and permissions problems, because you don’t want the AI to color outside the lines and say, “In addition to refunding Dan’s socks, I also refunded every customer for all transactions ever. And then I sent a bunch of money to my own AI bank account.” There’s more to the challenge, but that’s the vision.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan Shipper&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;I think the place we started was you saying it’s not working. But I don’t think that’s the reason it’s not working today. Is that the reason why it’s not working today?&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Alex Rattray&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;What people do with MCP today is sometimes try to expose all parts of their API. 
The way people generally build MCP tools is they have an underlying API—usually a REST API—and they wrap different parts of it, different endpoints, different operations, in MCP tools. You can do that in a one-to-one mapping, or you can kind of handcraft things for the MCP. Today, in order to succeed, people are finding you really have to handcraft it to the MCP, to the LLMs. You have to say, “Okay, I’m making one specialized tool to look up a customer and refund their transaction based on a description.”&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan Shipper&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;So there are all these decisions you have to make where you need to have the ergonomics of the model in mind—how the model thinks—in order to make sure the model does the right thing more often than not.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Alex Rattray&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Yeah, it’s hard. I use this SDK analogy sometimes. It took a long time for humanity to get to the point where we could make a really good Python SDK for a developer wrapping an API. I think we’ve cracked that nut. Stainless offers really great Python libraries, but we’re building on the shoulders of giants here. We haven’t figured out how to expose an API ergonomically to an LLM in the same way we’ve figured out how to expose it ergonomically to a Python developer. That’s a new research problem in a sense.&lt;/p&gt;&lt;p&gt;And it’s harder because I can go learn how to be a Python developer if I want. I can’t really learn how to think or see like an LLM. That makes it tricky.&lt;/p&gt;&lt;p&gt;We do have at Stainless some things we’re cooking up to address some of these problems. LLMs have a really hard time with a repeated, sustained chain of actions. Even if you get an API response back for “list all the transactions,” there’s so much data, and you might have to go through the next page and the next page to find the one that has Dan with the stripey socks. 
That’s again a ton of context with one or two small needles in the haystack. LLMs are pretty good at that, but not perfect—and with too much hay, we all end up throwing up our hands. That’s true for LLMs too.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan Shipper&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;When you’re building MCP servers for people—and when you see people doing it well today—what are the principles? How do you think about making an MCP server that one, people use, which is actually a big one, and two, when it is used, actually does the right job?&lt;/p&gt;&lt;p&gt;&lt;strong&gt;(00:20:00)&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Alex Rattray&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;There have been relatively few times I’ve seen it done well. I have seen it done well. We’re cooking something up that I’m really excited about. But with today’s technology, you really have to do a good job of product management. You have to go out into the market, talk to your customers, see what their actual needs are, look over their shoulders as they use and operate your software, and think about what you could unlock through AI where people would be doing things they can’t really do with your software today—because it just got so much easier. Then you have to do a lot of engineering work to wrap it up in a bow that works for the models.&lt;/p&gt;&lt;p&gt;You have to set up a really good system for evals, and if you’re doing MCP, you have to think about the different clients people might be using. Are they using Cursor? Are they using Claude Code? Something else? And the different models underlying all that. You end up with a pretty crazy matrix of things to optimize for and ways to evaluate whether what you’re offering is working well.&lt;/p&gt;&lt;p&gt;It’s also kind of a black box to get that feedback back to your servers so you can find out: we gave a tool call response here, was it actually any good? Did the user like it? Was the LLM able to use it? 
That’s a problem I haven’t seen a lot of people solve yet. Thinking about that as a first-class thing—maybe you have a send feedback tool, which is something we’ve been thinking about—so that if a user says out loud in the chat, “Oh man, that was useless garbage,” at least the MCP server finds out about that.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan Shipper&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Is there anything more concrete you’ve learned about how to design a good MCP server—beyond the obvious stuff about talking to customers and thinking about use cases?&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Alex Rattray&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;You want to keep the number of tools relatively small. You want the tool name and the description to be really precise and specific.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan Shipper&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Aren’t those two things at odds?&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Alex Rattray&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Yes. Good writing is hard. You can make a great tool that looks up a person by name and product description and then refunds them. You also want a small number of properties in the input schema—a small number of parameters, concisely described but sufficiently described. This is also hard. You want the response data to come back with very little data—only exactly what the model will need. That’s also very hard because you may not know a priori which things the model is really looking for.&lt;/p&gt;&lt;p&gt;We have a technique we use in our MCP servers today where we give the model a JQ filter, which is a way of filtering out JSON, and that can work pretty well. 
But that’s kind of a special trick.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan Shipper&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Doesn’t this mean that MCP just needs another level like a search tool—search, like, find a list of relevant tools given my task?&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Alex Rattray&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;The tool browsing problem is definitely a serious one, and that is one approach. We actually do this at Stainless today, where you can get an MCP server for your API that just has, like I was saying earlier, the very simple thing of every endpoint exposed as a tool. If you have a small API, that works great. You can also filter it, so you expose an MCP server with only a small subset of your endpoints. That works great.&lt;/p&gt;&lt;p&gt;You can also use what we call dynamic mode, where there are three tools no matter how big your API is. One is list endpoints, another is get endpoint and learn about it, and the last one is execute endpoint. That enables the context thing to scale really well, but it means three turns of the model just to do one thing. So that gets slower. It’s more expensive in another sense, and there’s some lossiness. It performs pretty well usually, but not quite as well because the tools aren’t loaded up in quite the same way.&lt;/p&gt;&lt;p&gt;Are you using MCP servers yourself?&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan Shipper&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Yeah.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Alex Rattray&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Funnily enough, not so much on the coding side—I use it on the business side. I’ll use the Notion, HubSpot, and Gong MCP servers and an MCP server for our database—a read-only copy—and say, “Hey, what are the interesting customers that signed up for Stainless last week?” It’ll go off and make a great query of our Postgres database, cross-reference those things in HubSpot, look up our notes in Notion, maybe even look at transcripts in Gong, and tell me all about it. 
It’s incredible.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;(00:30:00)&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan Shipper&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;And so that’s one of your big use cases. How often are you doing that? I’m now interested—not even from an MCP perspective, but for anyone running a business with some complexity who wants to know what’s going on. What are you actually doing, what is the report that comes out, and how often? Tell me so I can steal it.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Alex Rattray&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;For me it’s still usually in kind of playing-around mode. One of the things is the MCP servers disconnect, and then I get annoyed. You have to reconnect, which is not a huge deal, but there are a lot of little paper cuts still in technology this new that can hold back some amount of usage.&lt;/p&gt;&lt;p&gt;One thing I found really helpful at the meta level—and I’m sure you’ve had other guests talk about this—is the practice of just collecting notes for the AI by the AI, then edited and curated by yourself. I have a notes folder, a research folder, something like that in a special Git repo that I use just for this sort of internal stuff. I tell the AI: “When you find interesting customer quotes, put them in this folder and give the full citation,” so that the next time I start asking interesting questions, it doesn’t have to go searching through the MCP servers again. It has them cached in markdown files on disk.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan Shipper&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Wait, that’s crazy. What are you using to write into that Git repo? Is it Claude Code? ChatGPT? 
How does it get in there?&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Alex Rattray&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;I use Claude Code these days for that kind of thing.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan Shipper&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;So you just have Claude Code open and running, and then a new customer testimonial comes in and you’re like, “Hey, can you throw this into my master company Git knowledge repository?” And then whenever you need anything later you’re like, “Claude, go search through my master repository to figure out where the best customer quote is for this.”&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Alex Rattray&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Totally.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan Shipper&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;That’s so cool. What kind—can we see it?&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Alex Rattray&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;No, it’s too messy and probably has a lot of confidential information—the latter being more important.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan Shipper&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;When you say it’s messy, are you having Claude organize it at all? How is it structured?&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Alex Rattray&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;There’s a lot that I want to do here that we haven’t had the chance to do yet. There’s some lower-hanging fruit that our business team is working through right now, just on the basics of your CRM systems and so on. It’s not well-structured now, but I think that’s fine. I’m not going to prioritize structuring it super well until we’re using it more broadly. I use it some of the time. One of the business people on the team uses it a fair amount. One or two of our customer support engineers use it a lot. But it’s not yet broader than that, and I’d like it to get there. Once we see how everything’s evolving, that’s when we’ll start bringing in more structure. As it is, Claude Code can handle unstructured stuff really well. You don’t have to think about it too hard in advance. 
You can move things around later.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan Shipper&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;What else do you have in there other than customer quotes?&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Alex Rattray&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;SQL queries. I’m a software developer—I don’t write a lot of code these days, but I spend a lot of time doing that. When I say, “Hey, how is our month-on-month growth of XYZ metric over the last three months?”—I did this recently for my last board prep—it came out with a pretty good answer right away, and I was like, “Wow, this is awesome.” Then I looked a little deeper and realized I actually wanted to exclude certain users from the analysis and filter it this way and that way. I imbued more business context into that SQL query and iterated with Claude Code to get it better and better for the specific metric and the specific story I was trying to tell. Then I got it to a good place and said, “Great, let’s dump this into an analytics folder for future use.”&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan Shipper&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;So next time you’re doing board prep, you can be like, “Hey, what was that query we did last time?” and it’ll go get it.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Alex Rattray&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Yeah. That’s really cool.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan Shipper&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;What else?&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Alex Rattray&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;As any software team is doing these days—we’re using this for, “Hey, a customer comes in with a question. Can Claude Code just fix it?” In some cases, a Linear ticket gets filed, and our support engineers are really very technical. They may not have the wall clock time to chase down the fix themselves on an incoming bug. They have the technical skill, but another customer writes in two minutes later and they want to jump on that. 
They don’t want to be knee-deep in a debugger.&lt;/p&gt;&lt;p&gt;So sometimes what we do is file the ticket—intending to do it later, or for another engineer to do it later—but say, “Hey, can we see if Claude Code can just take a crack at it?” Is that going to work out 100% of the time? Definitely not. Is that going to work out 50% of the time? Still no, to be honest. But can that improve the overall efficiency? Yeah, maybe. We’re still experimental there, but we’re seeing a lot of promise.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan Shipper&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;In our pre-production call, you were talking about having a big vision for the future of AI. Do you want to walk me through that?&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Alex Rattray&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;I would love to. We talked earlier about how agentic AI can make operators’ lives a lot easier by taking certain pedestrian tasks and running with them independently. That’s something I think as an industry we’re almost on the cusp of.&lt;/p&gt;&lt;p&gt;A big part of the way I see things unfolding from here—I like to say the future of AI is cyborgs. Which is already sort of ridiculous because what is a cyborg other than a robot? But cyborg, as I understand it, is a term that means you’re part person and part machine. In this case, when you go and talk to an agent, what you’re going to be getting is part LLM neural net and part code—where the machine I’m talking about is traditional CPU software, not GPU software.&lt;/p&gt;&lt;p&gt;I think this will play out in two main ways. One is your kind of one-off operational use cases like we were talking about a minute ago, and then the other is production software.&lt;/p&gt;&lt;p&gt;In the use case where someone needs to perform some tricky one-off action with a bunch of points and clicks, and now we want an AI to just make a bunch of tool calls—the way I actually see that happening and what we’re building toward is code execution. 
Rather than the model having a bajillion tools, the model has two tools. One to execute code—where it just has a text box of “put in some TypeScript, and you’re going to use this API’s TypeScript SDK, and you’re going to write stripe.charges.list, stripe.customers.retrieve, stripe.refunds.create.” This is really easy for models. They’re really good at writing code.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;(00:40:00)&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;If you give that tool a little bit of a README—“here’s an example request, here are some other API calls you can make”—it’s really good at extrapolating from patterns when the SDK and the API are well-formed and predictable. Then you give it an additional tool to search the docs and ask questions of the docs. Anything it’s not sure about or gets wrong on the first try, you give it the documentation.&lt;/p&gt;&lt;p&gt;What this does for the scenario we were talking about earlier is you have very limited impact on the context window up front—we’re talking about 1,000 tokens or something like that. And the context impact of doing a whole bunch of paginated list requests? Zero. The model will go look for somebody named Dan and double-check that the purchase was stripey socks. You might write three nested for loops, but then only at the end when it found the right thing it’ll console.log “found Dan, customer ID, blah blah, transaction ID, blah blah.” Then create refund—refund ID one, two, three.&lt;/p&gt;&lt;p&gt;The context hit coming back from all of this is going to be like 10 lines of text. It’s really minimal. And all of this will run really quickly too, so you don’t have a round trip to the model every time you’re doing something like this. It’s just CPU code, and it runs in a server in the cloud right next to the Stripe API somewhere in AWS. 
It goes super fast.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan Shipper&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;What I’m understanding you to say is that the language model has a tool where it can write code and send that code to whatever API provider—Stripe, whoever’s MCP server you’re using—they’ll go and execute that code, that code is going to interact with their API, and then return the results. Rather than having 50 different possible tool calls and all that stuff, it’s just: model writes API code, API provider executes that code, runs it on their API, and returns the results.&lt;/p&gt;&lt;p&gt;Why wouldn’t my model just write the code that I then run myself instead of relying on an API provider to do it?&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Alex Rattray&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;I expect that will happen a lot more. I expect the code execution tool is going to become the most widely used tool. The problem is that today the code execution tool doesn’t work so well with libraries. LLMs have a hard time knowing exactly what version of a library they’re using, using the right version—probably usually the latest version—and not hallucinating aspects of the API, and knowing how to iterate if they hallucinate wrong.&lt;/p&gt;&lt;p&gt;And if it can’t use any library off NPM or the Python Package Index really, really well, basically perfectly out of the box, then forget about using a library. At that point you just have to hit the raw HTTP API. And in order to figure out what’s in there, you need the whole OpenAPI spec, and you’re back at square one because that document is massive.&lt;/p&gt;&lt;p&gt;Furthermore, something that’s really scary about that is if you don’t have a typed library with static typing where the computer can say what you’re trying to do is wrong, then the LLM will try to make an API request that is wrong some percentage of the time. 
The code execution tool can run a type checker and say, “You’re asking about stripe.transactions.list, but that actually doesn’t exist. Stripe doesn’t have a transactions API. You might want payment intents, you might want orders, you might want balance transactions. Which one do you want?”&lt;/p&gt;&lt;p&gt;And if the API provider is doing a great job building this tool, it’ll return the documentation for all of these things inline. It might have its own AI look at what the model’s trying to do and come up with a suggestion. That sub-agent is well-trained, well-specified, always updating, and isn’t burdened with the context of the full conversation.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan Shipper&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;What do you think of the security model?&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Alex Rattray&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;The security model is really, really interesting. This is another area where we’re really starting to think about things at Stainless, and I’m getting really excited about it—so if any listeners are really interested in this and have some ideas or want to talk, please do reach out.&lt;/p&gt;&lt;p&gt;At the end of the day, I think security has to take place at the API layer itself. Right now you see people trying to implement security by limiting what’s exposed through MCP, and that kind of makes sense—but at the end of the day, you could do anything that’s in the API under the hood.&lt;/p&gt;&lt;p&gt;What people should be doing is using OAuth with granular permissions, with proper scopes. At that point, the security happens in the right place, which is at the API layer. There are limitations to OAuth scopes and it’s pretty hard to build. 
It’d be nice if someone made that easy, but in my view, that is the right layer.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan Shipper&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Going back to my earlier question—I’m thinking about the idea of having a model write code that the API provider then executes to interact with their API and returns the results. Would you ever consider just creating a code execution environment that developers use themselves? Because, for example, I’m thinking about Cora. It has all these tools. Maybe Gmail is going to build a code execution thing, but really I’d want something like what you’re talking about inside of Cora. What I’d need is a computer use tool where I control the environment, I can install different libraries in it, and it can call any API—it just needs to have network access basically.&lt;/p&gt;&lt;p&gt;You guys should build that.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Alex Rattray&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;We’re working on it.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan Shipper&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Fuck yeah. You’re building it for developers who want to access MCP servers, or for people who are providing MCP servers?&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Alex Rattray&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;We’re starting with people who are providing MCP servers, but ultimately I think we’re going to need this to work such that you can give the model a code execution environment where it can hit not only the Stripe integration but also the Salesforce integration and also anything else. But not too much anything else. One of the advantages of starting where we’re starting—just one API provider—is that you ensure there are no network connections allowed out of that sandbox where we’re running the code to anything other than, in this case, api.stripe.com. That’s really critical for security for something like this.&lt;/p&gt;&lt;p&gt;There are ways to expand that bit by bit and keep things secure. 
It’ll take some time.&lt;/p&gt;&lt;p&gt;The other thing to point out as you see some of these generalizations is it’s not just that you want this code execution sandbox to work really well for any API, for any library—which I think we really need. You also start to see that this is just a powerful model for AI doing stuff. Sometimes you realize that the thing the AI did this one time in this one-off case is actually enduringly useful. Maybe any time a customer writes into support and says, “My socks had holes in them,” they should automatically get a refund. Maybe you want that, maybe you don’t—but there’s a lot of stuff that people do once, then twice, then three times, and then they say, “Okay, we should automate this.” That’s what software teams do all day, every day.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;(00:50:00)&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;I think we’re also going to be seeing that with AI—where the same code search tool we’re talking about, all the same prompting that will make an AI really, really good at interacting with an API in one of these code sandboxes, almost quote unquote “in its brain,” where it can write code in its head, run the code in its head, see the results, and then move forward with your task—it should be able to say, “Actually, this is enduringly useful code. Let me commit this to the repo.”&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan Shipper&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Yeah, yeah. Chat is a really good interface for exploring, but sometimes you just want a dashboard. I just want to log into my Stripe dashboard and see all the stuff without having to be like, “What is my MRR?” It should just show up because I do that every day.&lt;/p&gt;&lt;p&gt;But I want to push you as a hashtag value-add investor. 
I think there’s a thing that happens in AI where often, on the first attempt at something like this, people try to be really cautious—and I’m sure your enterprise customers care about that—but the things that get adopted are often the ones willing to take the risk to be YOLO very early.&lt;/p&gt;&lt;p&gt;An example: DALL-E was totally private for a long time, and people were posting some images but you couldn’t get in. Then Stable Diffusion was just like, “Forget it, anyone can use this.” And that really started the whole image generation wave. Obviously Stable Diffusion fumbled the bag, but they had a lead for a while.&lt;/p&gt;&lt;p&gt;Same thing for Claude Code. If you look at the difference between Codex CLI and Claude Code—Claude Code was just YOLO mode. It’s super industrious. It has a sandbox, but you can just do --dangerously-skip-permissions. Codex fell way behind because first it was in the browser, so the whole thing was locked down. Then it was in the CLI, but it was really built for pair programming, so it wasn’t particularly industrious. It wouldn’t go off and do a bunch of stuff. It would get locked out of doing certain things even in full auto mode.&lt;/p&gt;&lt;p&gt;And now they’ve caught up because you can just let it do whatever you want. So I would really push you: there might be a version you could do like today or tomorrow or very soon for individual developers that would let them set up this environment that, for example, I would use immediately. I care about security, but I care a lot less than some gigantic enterprise company. And I think the people like me who are building at this scale are eventually hopefully going to be the big companies, but we’re the ones really doing the AI-first adoption, not the big companies.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Alex Rattray&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;I would love to get this in your hands. 
What are some of the APIs your team uses the most?&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan Shipper&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Thinking about all our different products, I’m thinking right now about Cora, the email assistant. It has all the big APIs it’s using—mostly the Gmail API. You’re interacting with the assistant over chat, and it has a list of tools: archive email, draft email, send email, and so on. It categorizes your mail in certain ways.&lt;/p&gt;&lt;p&gt;I think we’d definitely try out something like this because if it ran the same way, it would make it much more flexible for us to make more tools and not break old ones. It’s really interesting.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Alex Rattray&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;In a sense, what I actually predict is that people who are quote unquote “building tools”—once we have a code execution super-tool like I’m talking about—is that the only way you really “build a tool” is with instructions, with prompts. The full power of everything you could possibly do in the Gmail API, for example—it’s all there in one tool. But sometimes you have specific tasks or specific categories of work you want to describe in a particular way, to help the LLM perform a sequence of actions as productively as possible. At that point, the only engineering work you have to do is prompt engineering.&lt;/p&gt;&lt;p&gt;We’ll see if it’s that “easy.” As we all know, prompt engineering can be really tricky. But I think that’s part of the vision.&lt;/p&gt;&lt;p&gt;That being said, we do have some pretty nifty ways with the MCP servers we generate today to help developers mix and match all the parts of the different tools underlying all the different parts of the API as they compose and write their own tools.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan Shipper&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;This is awesome. 
For people who are listening and want to know more from you or more from Stainless, where should they find you?&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Alex Rattray&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Stainless.com is our website. At least visit stainless.com.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Dan Shipper&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Alex, great to have you on. I can’t wait to do more of this when you have some of these new things launched. This is really, really fun—great to chat.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Alex Rattray&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Thanks, Dan. You too.&lt;/p&gt;</description>
      <author>Dan Shipper / AI &amp; I</author>
      <pubDate>2026-05-20 13:00:00 -0400</pubDate>
      <guid>https://every.to/podcast/inside-stainless-the-developer-tools-startup-anthropic-just-bought-for-300-million</guid>
      <link>https://every.to/podcast/inside-stainless-the-developer-tools-startup-anthropic-just-bought-for-300-million</link>
    </item>
    <item>
      <title>Inside the 100-agent Software Factory</title>
      <description>&lt;table&gt;&lt;tr&gt;&lt;td&gt;&lt;img alt="Context Window" src="https://d24ovhgu8s7341.cloudfront.net/uploads/publication/logo/94/small_context_windown_1.png" /&gt;&lt;/td&gt;&lt;td&gt;&lt;/td&gt;&lt;td&gt;&lt;table&gt;&lt;tr&gt;&lt;td&gt;by &lt;a href="https://every.to/@katie.parrott12" itemprop="name"&gt;Katie Parrott&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;in &lt;a href="https://every.to/context-window"&gt;Context Window&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;figure&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/post/cover/4266/full_page_cover_57af9438f43da9c6-Cover_image_for_today.png"&gt;&lt;figcaption&gt;Midjourney/Every illustration.&lt;/figcaption&gt;&lt;/figure&gt;&lt;p&gt;Happy Tuesday! Today we have a mini Vibe Check on a tool for running more than 100 coding agents in parallel. Plus: how to write viral X posts using the secrets of Grok’s algorithm, why Every’s chief operating officer and head of marketing moved their agent work into public Slack channels, and what’s overtaking Markdown as the preferred format for agents.&lt;/p&gt;&lt;p&gt;&lt;em&gt;Was this newsletter forwarded to you? 
&lt;u&gt;&lt;a href="https://every.to/account" rel="noopener noreferrer" target="_blank"&gt;Sign up&lt;/a&gt;&lt;/u&gt; to get it in your inbox.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;Mini-Vibe Check: Gas City&lt;/h2&gt;&lt;h3&gt;A glimpse of the future that’s not (yet) ready for practical use &lt;/h3&gt;&lt;p&gt;Earlier this year, prominent software engineer &lt;strong&gt;&lt;u&gt;&lt;a href="https://steve-yegge.medium.com/welcome-to-gas-town-4f25ee16dd04" rel="noopener noreferrer" target="_blank"&gt;Steve Yegge&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; published a viral Medium post about &lt;u&gt;&lt;a href="https://steve-yegge.medium.com/welcome-to-gas-town-4f25ee16dd04" rel="noopener noreferrer" target="_blank"&gt;Gas Town&lt;/a&gt;&lt;/u&gt;, an open-source tool that let developers coordinate 20 to 30 AI coding agents in parallel on the same codebase. Last week, Every’s head of tech consulting, &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@mike_2114" rel="noopener noreferrer" target="_blank"&gt;Mike Taylor&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;, got a peek at the future of multi-agent engineering with Gas Town’s successor project, &lt;u&gt;&lt;a href="https://github.com/gastownhall/gascity" rel="noopener noreferrer" target="_blank"&gt;Gas City&lt;/a&gt;&lt;/u&gt;. The project was &lt;u&gt;&lt;a href="https://steve-yegge.medium.com/welcome-to-gas-city-57f564bb3607" rel="noopener noreferrer" target="_blank"&gt;rebuilt as a toolkit&lt;/a&gt;&lt;/u&gt; with Yegge’s blessing by &lt;strong&gt;Chris Sells, &lt;/strong&gt;a long-time developer-tools veteran who grew Google’s open-source app-building toolkit, Flutter, to 3 million developers, and former Block technical lead &lt;strong&gt;Julian Knutsen&lt;/strong&gt;. Mike joined more than two dozen engineers and chief technology officers who played around with the project at a workshop in New York, with Sells and Knutsen dialing in. 
&lt;/p&gt;&lt;p&gt;&lt;strong&gt;TL;DR:&lt;/strong&gt; Gas City has some sharp ideas that reflect the direction software development is headed, but it’s not yet ready for prime time. &lt;/p&gt;&lt;p&gt;&lt;strong&gt;What is Gas City:&lt;/strong&gt; Running many coding agents in parallel is table stakes for developers at this point. Getting them to do anything useful requires coordination systems that let them hand work to each other, review each other’s output, and avoid stepping on each other’s branches—and nobody’s quite figured out how to get that right yet. “Software factories” like Gas City are one solution: an orchestration system that hands tasks to a small team of agents, routes their work, and decides what’s done. &lt;/p&gt;&lt;p&gt;Sells and Knutsen use Gas City to build Gas City: Knutsen’s Atlanta-based server runs roughly 100 agents that merge around 50 pull requests per day—the output of a small team—burning through roughly a billion tokens per day, equivalent to roughly one-fifth of the English-language corpus on Wikipedia. &lt;/p&gt;&lt;p&gt;&lt;strong&gt;What works:&lt;/strong&gt; Gas City is built on three ideas from the world of software engineering that are worth internalizing, even if you never touch the toolkit. &lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;em&gt;Dark factory versus light factory:&lt;/em&gt; The parts of your work where humans and agents talk to each other (planning, design, review) stay visible and can be thought of as light, while the parts where agents grind through clearly defined work on their own stay in the background, in the dark. As you gain trust in the agents’ output, you can move more of your process into the dark. 
&lt;/li&gt;&lt;li&gt;&lt;em&gt;One pet, many cattle:&lt;/em&gt; The future of multi-agent engineering will likely be organized around one persistent, named supervisor you talk to directly (Gas City calls it the “mayor”), who hands tasks to anonymous, disposable workers (the “polecats”) that do one job and shut down, so they execute their job without getting lost in context or getting in each other’s way. Instead of managing one hundred agents individually, you manage one conversation while the mayor does the coordinating. &lt;/li&gt;&lt;li&gt;&lt;em&gt;Multiple opinions on every code review:&lt;/em&gt; Give the same code to &lt;u&gt;&lt;a href="https://every.to/vibe-check/opus-4-7" rel="noopener noreferrer" target="_blank"&gt;Claude&lt;/a&gt;&lt;/u&gt;, &lt;u&gt;&lt;a href="https://every.to/vibe-check/vibe-check-codex-openai-s-new-coding-agent" rel="noopener noreferrer" target="_blank"&gt;Codex&lt;/a&gt;&lt;/u&gt;, and Kimi at the same time for review from multiple angles. Three different models catch different bugs than one model run three times.&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;&lt;strong&gt;What could be better:&lt;/strong&gt; In Gas City, every task spins up a fresh agent session that doesn’t remember the earlier steps, so agents waste cycles re-reading context that other agents produced and miss connections a single session would have caught. Cost is also a challenge: A six-step job can cost six times as much as one Claude session, which adds up fast. The toolkit still feels experimental: It took a room full of experienced engineers an entire day to get it running, even with support from the instructors.&lt;/p&gt;&lt;p&gt;Beads, the task tracker powering the system, is built for agents first. It runs on the command line rather than as a visual dashboard, which is fine for agents but harder for humans, who want to see everything at a glance. So teams using Gas City in production typically pair it with Jira or Linear—tracking tasks in two places instead of one. 
&lt;/p&gt;&lt;p&gt;Additionally, Gas City was built on the assumption that AI models need hand-holding to stay on track, but models have gotten good enough that the guardrails built for that purpose, such as review loops to catch mistakes and mid-task check-ins to prevent agents from drifting, are now mostly unnecessary. Finally, Gas City uses deliberately unfamiliar words to refer to different inputs, actors, and workflows—“beads” for tasks, “polecats” for workers, “refineries” for processing steps—so it can be confusing for a team new to the tech. &lt;/p&gt;&lt;p&gt;&lt;strong&gt;Verdict:&lt;/strong&gt; 🟨 &lt;strong&gt;Mike Taylor&lt;/strong&gt;, head of tech consulting: “Learn from the ideas. Skip the toolkit for now.”&lt;/p&gt;&lt;p&gt;If you’re already running more than 10 Claude Code sessions in parallel and reading source code, Gas City is worth a look because it’s one informed opinion on how to handle that level of orchestration. For everyone else, take the ideas and wait. &lt;u&gt;&lt;a href="https://openai.com/index/open-source-codex-orchestration-symphony/" rel="noopener noreferrer" target="_blank"&gt;OpenAI’s Symphony&lt;/a&gt;&lt;/u&gt;, released a few weeks ago, is a more accessible, enterprise-ready version of a similar idea: a written set of rules that turns your existing Linear board into the dashboard the agents work from. This is more in line with the way software engineers work now and doesn’t require the behavior change that Gas City does. &lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;Steal this workflow&lt;/h2&gt;&lt;h3&gt;Run your X posts past Grok before you post&lt;/h3&gt;&lt;p&gt;xAI &lt;u&gt;&lt;a href="https://github.com/xai-org/x-algorithm" rel="noopener noreferrer" target="_blank"&gt;open-sourced its ranking algorithm&lt;/a&gt;&lt;/u&gt; last week, revealing the factors X considers when deciding which posts to surface in users’ For You feeds. 
It includes a Grok-powered “banger classifier” that decides whether your post gets better distribution by scoring every post on quality and slop. So why not run the same check on yourself before you hit publish?&lt;/p&gt;&lt;ol&gt;&lt;li&gt;&lt;strong&gt;Paste your draft into Grok with X’s scoring prompt.&lt;/strong&gt; Ask Grok to return four things: quality_score, slop_score, isHighQuality (a true-or-false verdict on whether a post clears the quality bar), and topic tags. The classifier reads text, images, and video. Use this prompt: “Score this X post the way the xai-org/x-algorithm banger classifier would: return quality_score (0–1), slop_score (1–3), isHighQuality boolean, and topic tags.” &lt;/li&gt;&lt;li&gt;&lt;strong&gt;Rewrite anything that scores below 0.4 on quality (scored from zero to one) or above one on slop (rated from one to three). &lt;/strong&gt;Posts that users scroll past quickly or report get penalized, while posts that drive replies and dwell time get rewarded. To move the score, lead with a stop-the-scroll first line, name a specific experience, event, or number, and cut anything readers would skim. As soon as a user scrolls past, the algorithm marks the post as “not_dwelled” and it gets pushed down the recommendation pile.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Limit yourself to two to three posts a day.&lt;/strong&gt; The algorithm heavily discounts your fourth post and cuts your eighth to near zero in the ranking system. 
It’s better to invest in fewer scroll-stopping, engagement-generating posts than in many forgettable ones.&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;Signal&lt;/h2&gt;&lt;h3&gt;HTML is the new Markdown &lt;/h3&gt;&lt;p&gt;&lt;strong&gt;What happened: &lt;/strong&gt;Until a few weeks ago, Markdown, a lightweight text formatting system, was the be-all and end-all of documentation for AI agents, because agents had been trained on so much of it that they read and write it fluently. Then, on May 8, Anthropic’s &lt;strong&gt;Thariq Shihipar&lt;/strong&gt; published an X post titled &lt;u&gt;&lt;a href="https://thariqs.github.io/html-effectiveness/" rel="noopener noreferrer" target="_blank"&gt;“The Unreasonable Effectiveness of HTML,”&lt;/a&gt;&lt;/u&gt; which argued that agents should produce single-file HTML instead of Markdown when they create files. The post hit 4.4 million views in 16 hours. Three days later, &lt;strong&gt;&lt;u&gt;&lt;a href="https://x.com/karpathy/status/2053872850101285137" rel="noopener noreferrer" target="_blank"&gt;Andrej Karpathy&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;&lt;u&gt;&lt;a href="https://x.com/karpathy/status/2053872850101285137" rel="noopener noreferrer" target="_blank"&gt; backed it&lt;/a&gt;&lt;/u&gt;. &lt;strong&gt;Simon Willison&lt;/strong&gt;, a longtime Markdown advocate, also &lt;u&gt;&lt;a href="https://simonwillison.net/2026/May/8/unreasonable-effectiveness-of-html/" rel="noopener noreferrer" target="_blank"&gt;changed his mind&lt;/a&gt;&lt;/u&gt;, saying that now that context windows are large enough, there’s no reason to accept Markdown’s formatting limitations.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Why it matters: &lt;/strong&gt;HTML can do what Markdown can’t, from styled tables and collapsible sections to embedded charts and lightweight JavaScript. Markdown felt like the right answer as long as humans would still edit what agents produced, because it’s legible to humans as well as agents. 
Increasingly, though, agents are producing documentation without humans needing to intervene. When no human is going to read or edit the raw output, you may as well opt for the format that produces a more dynamic result.&lt;/p&gt;&lt;div class="quill-block-image" id="quill-block-image-1779210398966" data-source="{&amp;quot;dom_id&amp;quot;:&amp;quot;quill-block-image-1779210398966&amp;quot;,&amp;quot;link&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4266/optimized_1f3e4c14-942e-40f4-a085-849247cd0ee9.png&amp;quot;,&amp;quot;image&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4266/optimized_1f3e4c14-942e-40f4-a085-849247cd0ee9.png&amp;quot;,&amp;quot;caption&amp;quot;:&amp;quot;Raw Markdown (left) is more legible and editable than HTML (right). (All images courtesy of Katie Parrott.)&amp;quot;,&amp;quot;error&amp;quot;:null}"&gt;&lt;div&gt;&lt;a href="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4266/optimized_1f3e4c14-942e-40f4-a085-849247cd0ee9.png" target="_blank" rel="noopener noreferrer"&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4266/optimized_1f3e4c14-942e-40f4-a085-849247cd0ee9.png" alt="Raw Markdown (left) is more legible and editable than HTML (right). (All images courtesy of Katie Parrott.)"&gt;&lt;/a&gt;&lt;figcaption class="quill-image-caption"&gt;Raw Markdown (left) is more legible and editable than HTML (right). 
(All images courtesy of Katie Parrott.)&lt;/figcaption&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;&lt;br&gt;&lt;/p&gt;&lt;div class="quill-block-image" id="quill-block-image-1779210429426" data-source="{&amp;quot;dom_id&amp;quot;:&amp;quot;quill-block-image-1779210429426&amp;quot;,&amp;quot;link&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4266/optimized_55aa4fb7-dca9-4a9f-a992-945fe40c06de.png&amp;quot;,&amp;quot;image&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4266/optimized_55aa4fb7-dca9-4a9f-a992-945fe40c06de.png&amp;quot;,&amp;quot;caption&amp;quot;:&amp;quot;Markdown (left) is a text-only format, while HTML (right) allows for richer outputs like dashboards, charts, and interactive sections.&amp;quot;,&amp;quot;error&amp;quot;:null}"&gt;&lt;div&gt;&lt;a href="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4266/optimized_55aa4fb7-dca9-4a9f-a992-945fe40c06de.png" target="_blank" rel="noopener noreferrer"&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4266/optimized_55aa4fb7-dca9-4a9f-a992-945fe40c06de.png" alt="Markdown (left) is a text-only format, while HTML (right) allows for richer outputs like dashboards, charts, and interactive sections."&gt;&lt;/a&gt;&lt;figcaption class="quill-image-caption"&gt;Markdown (left) is a text-only format, while HTML (right) allows for richer outputs like dashboards, charts, and interactive sections.&lt;/figcaption&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;&lt;br&gt;&lt;/p&gt;&lt;p&gt;There’s a wrinkle, though: The tools we use to share and discuss documents, such as Slack and Google Docs, were all built for Markdown and plain text. Slack previews a Markdown file in the message, whereas HTML shows up as an attachment you have to download. Google Docs threads and GitHub diffs don’t know what to do with a self-contained HTML document. The moment agents start producing HTML by default, our tools will need to adapt to keep up. 
&lt;/p&gt;&lt;h5&gt;&lt;strong&gt;What to do this week: &lt;/strong&gt;&lt;/h5&gt;&lt;p&gt;When you’re deciding between Markdown and HTML, ask whether the document will be edited or just consumed.&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;strong&gt;Markdown if it’ll be edited or parsed as source.&lt;/strong&gt; This includes drafts, plans, briefs, system prompts, and AGENTS.md—anything humans will keep working on, or agents will read as instructions.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;HTML if it’s a finished output humans will read.&lt;/strong&gt; That includes assets like research summaries, weekly recaps, dashboards, or spec demos.&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;Inside Every&lt;/h2&gt;&lt;h3&gt;Working with our agents in public&lt;/h3&gt;&lt;p&gt;Working well with an agent is a skill new enough that there aren’t really best practices yet. So Every’s team has started learning from each other. &lt;/p&gt;&lt;p&gt;Last week, Every COO &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/on-every/brandon-gell-joins-every-as-our-first-entrepreneur-in-residence" rel="noopener noreferrer" target="_blank"&gt;Brandon Gell&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; and head of marketing &lt;strong&gt;Douglas Brundage&lt;/strong&gt; each started public channels with their agents where anyone on the team can observe how they’re working together. Within 48 hours, a dozen people from across the company had joined to lurk.&lt;/p&gt;&lt;p&gt;The idea is that every request that would normally live in a direct message goes in the channel. Brandon asked his agent to pull from Stripe a breakdown of where subscribers are located. Douglas asked his to evaluate customer survey responses against classic marketing frameworks. 
There was a 41-message thread on whether to hook the agent into the &lt;u&gt;&lt;a href="https://every.to/thesis/creative-work-is-about-to-look-a-lot-more-like-programming" rel="noopener noreferrer" target="_blank"&gt;Flora API&lt;/a&gt;&lt;/u&gt;.&lt;/p&gt;&lt;p&gt;The corrections in the channel double as useful learning material—watching Douglas tell the agent its survey analysis is “performing research” rather than “mining the results for strategic clarity” gives the people watching an understanding of an agent’s limitations and hidden assumptions they should look out for in their own agentic work. Agents can learn from the interactions, too: Brandon has been routing every task through his agent for a week, even the ones he could do faster himself, so it can watch him work and write its own skill at the end. For now, the best way to learn how to work with agents may be to watch other people do it.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;em&gt;Correction:&lt;/em&gt;&lt;/strong&gt;&lt;em&gt; An earlier version of this newsletter imprecisely described the distinction between Markdown and HTML. The key distinction is whether a document is meant to be edited or consumed. We’ve updated the language to reflect this.&lt;/em&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;em&gt;&lt;a href="https://every.to/@katie.parrott12" rel="noopener noreferrer" target="_blank"&gt;Katie Parrott&lt;/a&gt;&lt;/em&gt;&lt;/strong&gt; &lt;em&gt;is a staff writer at Every. You can read more of her work in&lt;/em&gt; &lt;em&gt;&lt;a href="https://katieparrott.substack.com/" rel="noopener noreferrer" target="_blank"&gt;her newsletter&lt;/a&gt;. 
To read more essays like this, subscribe to &lt;u&gt;&lt;a href="https://every.to/subscribe" rel="noopener noreferrer" target="_blank"&gt;Every&lt;/a&gt;&lt;/u&gt;, and follow us on X at &lt;u&gt;&lt;a href="http://twitter.com/every" rel="noopener noreferrer" target="_blank"&gt;@every&lt;/a&gt;&lt;/u&gt; and on &lt;u&gt;&lt;a href="https://www.linkedin.com/company/everyinc/" rel="noopener noreferrer" target="_blank"&gt;LinkedIn&lt;/a&gt;&lt;/u&gt;.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;We &lt;u&gt;&lt;a href="https://every.to/studio" rel="noopener noreferrer" target="_blank"&gt;build AI tools&lt;/a&gt;&lt;/u&gt; for readers like you. Write brilliantly with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://writewithspiral.com/" rel="noopener noreferrer" target="_blank"&gt;Spiral&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. Organize files automatically with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://makeitsparkle.co/?utm_source=everyfooter" rel="noopener noreferrer" target="_blank"&gt;Sparkle&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. Deliver yourself from email with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://cora.computer/" rel="noopener noreferrer" target="_blank"&gt;Cora&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. Dictate effortlessly with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://monologue.to/" rel="noopener noreferrer" target="_blank"&gt;Monologue&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. Collaborate with agents on documents with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;a href="https://www.proofeditor.ai/" rel="noopener noreferrer" target="_blank"&gt;Proof&lt;/a&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. &lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;We also do AI training, adoption, and innovation for companies. 
&lt;u&gt;&lt;a href="https://every.to/consulting?utm_source=emailfooter" rel="noopener noreferrer" target="_blank"&gt;Work with us&lt;/a&gt;&lt;/u&gt; to bring AI into your organization.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;For sponsorship opportunities, reach out to sponsorships@every.to.&lt;/em&gt;&lt;/p&gt;&lt;div class="quill-button" data-source="{&amp;quot;id&amp;quot;:&amp;quot;quill-button-1769187301610&amp;quot;,&amp;quot;url&amp;quot;:&amp;quot;https://every.to/subscribe?source=post_button&amp;quot;,&amp;quot;text&amp;quot;:&amp;quot;Subscribe&amp;quot;}" id="quill-button-1769187301610"&gt;&lt;a href="https://every.to/subscribe?source=post_button"&gt;Subscribe&lt;/a&gt;&lt;/div&gt;</description>
      <author>Katie Parrott / Context Window</author>
      <pubDate>2026-05-19 09:00:00 -0400</pubDate>
      <guid>https://every.to/context-window/inside-the-100-agent-software-factory</guid>
      <link>https://every.to/context-window/inside-the-100-agent-software-factory</link>
    </item>
    <item>
      <title>How to Start a Career When AI Is Doing Your Entry-level Job</title>
      <description>&lt;table&gt;&lt;tr&gt;&lt;td&gt;&lt;img alt="Working Overtime" src="https://d24ovhgu8s7341.cloudfront.net/uploads/publication/logo/100/small_Screenshot_2024-11-22_at_9.33.36_AM.png" /&gt;&lt;/td&gt;&lt;td&gt;&lt;/td&gt;&lt;td&gt;&lt;table&gt;&lt;tr&gt;&lt;td&gt;by &lt;a href="https://every.to/@katie.parrott12" itemprop="name"&gt;Katie Parrott&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;in &lt;a href="https://every.to/working-overtime"&gt;Working Overtime&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;figure&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/post/cover/4263/full_page_cover_a78c488a9c57186a-Cover_image_for_today.png"&gt;&lt;figcaption&gt;Midjourney/Every illustration.&lt;/figcaption&gt;&lt;/figure&gt;&lt;p&gt;&lt;em&gt;Was this newsletter forwarded to you? &lt;u&gt;&lt;a href="https://every.to/account" rel="noopener noreferrer" target="_blank"&gt;Sign up&lt;/a&gt;&lt;/u&gt; to get it in your inbox.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;My first job out of college was as a copywriter at a little crowdfunding website based in Columbus, Ohio, called &lt;u&gt;&lt;a href="http://fundable.com" rel="noopener noreferrer" target="_blank"&gt;Fundable.com&lt;/a&gt;&lt;/u&gt;. The company had no money, so they didn’t care that I had no experience. I had no experience, so I didn’t care that the job didn’t pay at first.&lt;/p&gt;&lt;p&gt;The offer was simple: Create a profile for your startup, and we’ll connect you with investors. Most founders didn’t want to write their own profiles, so my job was to take whatever strange, half-formed thing a founder was building and translate it into investor-speak. The profiles were so templatized I can still recite the format: problem, solution, traction, team, business model, revenue projections, competitive landscape, funding terms. 
&lt;/p&gt;&lt;p&gt;I’ve been thinking about that job lately because AI could now produce one of those profiles in two minutes. At 23, I would have heard that and thought: “Thank God.” At 36, I think: “Thank God it couldn’t.” Without that job, I would never have learned how to take a company apart and put it back together as a story, or how to organize information for an audience that, unlike my professors in undergrad, wasn’t being paid to read my stuff. &lt;/p&gt;&lt;p&gt;This year’s crop of recent graduates has it harder than mine did. AI, which can perform many entry-level tasks, is replacing those early experiences faster than employers can figure out what’s going on. Researchers at Stanford’s Digital Economy Lab found that employment for 22-to-25-year-olds in the jobs most vulnerable to AI has &lt;u&gt;&lt;a href="https://digitaleconomy.stanford.edu/publication/canaries-in-the-coal-mine-six-facts-about-the-recent-employment-effects-of-artificial-intelligence/" rel="noopener noreferrer" target="_blank"&gt;dropped 13 percent&lt;/a&gt;&lt;/u&gt; since late 2022, even as older workers in the same roles held steady.&lt;/p&gt;&lt;p&gt;I think about the 22-year-old version of myself, if I were sending out applications right now into the void of LinkedIn. What would she think about the headlines about AI and job displacement? Would she be scared? &lt;/p&gt;&lt;p&gt;Yeah, probably. She was scared of much less.&lt;/p&gt;&lt;p&gt;So with full awareness that no one born this millennium wants career advice from someone born before the fall of the Berlin Wall, here’s what I’d do if I were starting over today, knowing what I know about work, AI, and how one is shaping the other. &lt;/p&gt;&lt;h2&gt;There’s good news, and there’s bad news&lt;/h2&gt;&lt;p&gt;The paradox facing today’s entry-level workers is as old as the entry-level job itself: In many cases, in order to get a job, you need experience, but in order to get experience, you need a job. 
And while employers requiring AI experience may feel like a cosmic joke, given that the technology barely existed when you picked your major, they have long asked for five years of experience with brand-new technologies.&lt;/p&gt;&lt;p&gt;All that is small comfort to the recent grad with a near-empty resume. And there are qualitative differences in what AI is doing to entry-level work. &lt;/p&gt;&lt;p&gt;For one thing, when you look at the kind of &lt;u&gt;&lt;a href="https://naceweb.org/job-market/trends-and-predictions/demand-for-ai-skills-in-entry-level-jobs-nearly-triples-since-fall-2025" rel="noopener noreferrer" target="_blank"&gt;AI skills employers expect&lt;/a&gt;&lt;/u&gt; young workers to bring to the table, they want more than the ability to type a prompt into ChatGPT. They want people who can evaluate tools, review outputs, and figure out how to improve those outputs, whether through better prompting or by fixing the work themselves. &lt;/p&gt;&lt;div class="quill-block-image" id="quill-block-image-1779083146785-wz3yg6w1t" data-source="{&amp;quot;dom_id&amp;quot;:&amp;quot;quill-block-image-1779083146785-wz3yg6w1t&amp;quot;,&amp;quot;link&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4263/optimized_ce0924e9-cadd-4486-b764-619fcb61f29f.png&amp;quot;,&amp;quot;image&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4263/optimized_ce0924e9-cadd-4486-b764-619fcb61f29f.png&amp;quot;,&amp;quot;caption&amp;quot;:&amp;quot;Demand for AI skills in entry-level jobs is up three times, with a particular focus on capabilities that require you to evaluate AI as well as use it. 
(Chart courtesy of NACE.)&amp;quot;,&amp;quot;error&amp;quot;:null}"&gt;&lt;div&gt;&lt;a href="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4263/optimized_ce0924e9-cadd-4486-b764-619fcb61f29f.png" target="_blank" rel="noopener noreferrer"&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4263/optimized_ce0924e9-cadd-4486-b764-619fcb61f29f.png" alt="Demand for AI skills in entry-level jobs is up three times, with a particular focus on capabilities that require you to evaluate AI as well as use it. (Chart courtesy of NACE.)"&gt;&lt;/a&gt;&lt;figcaption class="quill-image-caption"&gt;Demand for AI skills in entry-level jobs is up three times, with a particular focus on capabilities that require you to evaluate AI as well as use it. (Chart courtesy of NACE.)&lt;/figcaption&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;&lt;br&gt;&lt;/p&gt;&lt;p&gt;They’re looking for judgment, which is something that you can really only build through experience. When I was writing those funding profiles, I learned how to tell good work from bad. The first 50 that I wrote were so bad that at one point, a client said I should be taken out back and shot. With AI in the mix, the bad ones wouldn’t have been bad enough to teach me anything.&lt;/p&gt;&lt;p&gt;The other way today’s job market is more intense for entry-level workers is that employers are expecting competence in a technology that won’t stand still long enough for anyone to completely grasp. Agentic tools are changing functions in months, &lt;u&gt;&lt;a href="https://insights.som.yale.edu/insights/the-real-job-destruction-from-ai-is-hitting-before-careers-can-start" rel="noopener noreferrer" target="_blank"&gt;rather than years&lt;/a&gt;&lt;/u&gt;. There’s no canon to study or senior teammate to apprentice under. 
Everyone in the org chart is figuring it out on the fly, and you’re expected to figure it out with them while learning how to navigate office politics and pay your taxes.&lt;/p&gt;&lt;p&gt;What to do about it?&lt;/p&gt;&lt;h2&gt;Chase problems, not professions&lt;/h2&gt;&lt;p&gt;When you’re a kid and an adult asks what you want to be when you grow up, the answer is always a job title. A firefighter. A doctor. A YouTube creator. We carry that habit of thinking into the years when we start to look for jobs. We pick a title, and we go after it.&lt;/p&gt;&lt;p&gt;The problem is that job titles aren’t as sure a target as they used to be. The role you’re chasing today might not exist &lt;u&gt;&lt;a href="https://every.to/working-overtime/the-18-month-career-is-here" rel="noopener noreferrer" target="_blank"&gt;18 months from now&lt;/a&gt;&lt;/u&gt;. &lt;/p&gt;&lt;p&gt;Pick a problem you want to help work on—something happening in the world that you find yourself thinking about, even when nobody is paying you to. The role of “content marketer” or “data analyst” may shrink, split, or even vanish, but the problem behind those titles—how to get a stranger to pay attention to something they didn’t know they cared about, how to make sense of a pile of messy numbers—will still be there, and somebody will still be paid to solve it.&lt;/p&gt;&lt;p&gt;I’ve been bad at taking this advice myself. I spent a decade chasing the title “copywriter” and then “content marketer” across a handful of industries that had nothing in common—oncology advertising, personal finance, even, God help me, crypto—without asking whether I cared about any of them. I had the high-school overachiever’s mindset: You didn’t have to be passionate about the subject to get an A. I’d been getting A’s in classes I had no feelings about for 16 years. Why would jobs be any different?&lt;/p&gt;&lt;p&gt;That strategy doesn’t work as well when AI can do the entry-level tasks. 
Your value to whoever hired you is whatever you bring on top of that—usually a deeper understanding of the problem than the model has. That kind of understanding is hard to build in a field you don’t care about. &lt;/p&gt;&lt;h2&gt;Choose one discipline to protect&lt;/h2&gt;&lt;p&gt;Once you’ve picked your problem, pick your craft, whether it’s writing, building, researching, designing, strategizing, or operating. &lt;/p&gt;&lt;p&gt;You’ve probably heard the truism that it takes 10,000 hours to gain mastery of a skill. The actual research is &lt;u&gt;&lt;a href="https://www.newyorker.com/sports/sporting-scene/complexity-and-the-ten-thousand-hour-rule" rel="noopener noreferrer" target="_blank"&gt;more complicated&lt;/a&gt;&lt;/u&gt; than the popularized version, but the underlying idea is right. You don’t get any good at anything until you’ve done it many, many times. &lt;/p&gt;&lt;p&gt;If you want to write for a living, write your own sentences. If you want to be an engineer, write your own code. &lt;/p&gt;&lt;p&gt;Protect this craft from AI at all costs. AI can find resources, explain things, quiz you, and point out where your reasoning has gaps. But if you let it write your sentences or do your research, you won’t get the hours of doing things badly that you need in order to do them well. &lt;/p&gt;&lt;p&gt;It’s easy for me to say this when I’m writing this with AI open in another tab. Claude wrote the first draft of half the sentences in this section. I &lt;u&gt;&lt;a href="https://every.to/working-overtime/how-to-keep-your-writing-weird-in-the-age-of-ai" rel="noopener noreferrer" target="_blank"&gt;rewrote them&lt;/a&gt;&lt;/u&gt;. That rewriting is what the discipline is for—noticing when something doesn’t pass muster. The reason I can do that is that I’ve been writing sentences for 10 years. &lt;/p&gt;&lt;p&gt;I know all too well how tempting cutting corners gets when the shortcut is right there in another tab. 
Don’t take it, and in five years you’ll be running circles around the people who did. &lt;/p&gt;&lt;h2&gt;Make things before anyone asks you to &lt;/h2&gt;&lt;p&gt;When I was first applying to jobs out of college, my resume said almost nothing about what I could do in the “real world,” unless the employer happened to be looking for someone with an undergraduate’s grasp of the themes of &lt;em&gt;Wuthering Heights&lt;/em&gt;. &lt;/p&gt;&lt;p&gt;A thin resume is less of a disadvantage than it used to be, particularly since employers are increasingly shifting to &lt;u&gt;&lt;a href="https://www.naceweb.org/job-market/trends-and-predictions/what-students-need-to-know-about-the-skills-based-hiring-process" rel="noopener noreferrer" target="_blank"&gt;skills-based hiring&lt;/a&gt;&lt;/u&gt;—screening candidates by what they can do rather than where they’ve been. &lt;/p&gt;&lt;p&gt;What you need to do in that environment is make something, and that can be anything—&lt;u&gt;&lt;a href="https://every.to/p/what-comes-after-linkedin" rel="noopener noreferrer" target="_blank"&gt;a small tool you wished existed&lt;/a&gt;&lt;/u&gt;, a piece of writing on a question nobody is paying you to think about. Pick the thing you’d want to use yourself, and make it.&lt;/p&gt;&lt;p&gt;Once your work gets you in the door, the conversation that follows is going to be about how you made it. What you used AI for, and where you decided not to—the moments where you looked at the model’s first answer and thought, “No, that’s not right.” Being able to walk someone through those decisions is the second skill you’re building, alongside the work itself. 
That’s the judgment that I mentioned before.&lt;/p&gt;&lt;h2&gt;Build the career coach you wish you had&lt;/h2&gt;&lt;p&gt;The last time I was job hunting, I built a &lt;u&gt;&lt;a href="https://every.to/working-overtime/i-hired-chatgpt-as-my-career-coach" rel="noopener noreferrer" target="_blank"&gt;career coach in ChatGPT&lt;/a&gt;&lt;/u&gt; and used it to land the job I have now. It was a project with my resume, a few examples of writing I was proud of, and a long prompt telling the model how to talk to me. I checked in with it most weekdays for about a month. What it did, more than anything, was give me somewhere to put my thinking. Instead of running the same anxious loop in my head, I could lay the question out and have the model suggest specific next steps, like a writing sample worth developing, or questions I could ask on that networking call that it encouraged me to seek out. By the end of that month, I had a job. &lt;/p&gt;&lt;p&gt;If I could hop in a time machine and travel back to talk to my 22-year-old self, I’d suggest that she make one too. It’s not even that hard: &lt;/p&gt;&lt;ol&gt;&lt;li&gt;&lt;strong&gt;Pick a tool.&lt;/strong&gt; ChatGPT and Claude both have a project feature that holds context, files, and conversation history across sessions. Either works. Free tiers are good enough to start. &lt;/li&gt;&lt;li&gt;&lt;strong&gt;Create a project and give it a name.&lt;/strong&gt; “Apprenticeship Coach,” “Career Stuff,” your friend’s nickname for you. &lt;/li&gt;&lt;li&gt;&lt;strong&gt;Load it with context.&lt;/strong&gt; Add examples of work you’re proud of and examples you wish were better—the model needs to see what you’re aiming at and where you’re starting from. Paste in a few job postings for roles you’d want, even if they might be too senior for you. Write a paragraph on the problem you care about and why. 
&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Tell it how to behave.&lt;/strong&gt; In your instructions, describe to the model how you want it to deliver feedback. If you want a tough critic, say so. If you’re prone to self-doubt, give it more of a cheerleader vibe. One thing to look out for: Models are infamous for sycophancy—telling you what you want to hear—so guard against that in your instructions, and even then, maintain a healthy skepticism of the outputs. It’s good practice for when you’re asked to work with AI in the workplace. &lt;/li&gt;&lt;/ol&gt;&lt;p&gt;Here’s a starting template. Fill in the bracketed sections, adapt the feedback line to match your preference, and add it to the custom instructions in your project:&lt;/p&gt;&lt;div class="quill-code-snippet code-snippet quill-editing" id="quill-code-snippet-1779083208116" data-code-snippet="" data-source="{&amp;quot;id&amp;quot;:&amp;quot;quill-code-snippet-1779083208116&amp;quot;,&amp;quot;title&amp;quot;:&amp;quot;Career coach prompt&amp;quot;,&amp;quot;language&amp;quot;:&amp;quot;javascript&amp;quot;,&amp;quot;code&amp;quot;:&amp;quot;&amp;quot;,&amp;quot;show_claude&amp;quot;:false,&amp;quot;show_chatgpt&amp;quot;:false,&amp;quot;show_gemini&amp;quot;:false,&amp;quot;show_copy&amp;quot;:true}"&gt;
      &lt;div class="code-snippet-header"&gt;
        &lt;div class="code-snippet-header-left"&gt;
          &lt;span class="code-snippet-title"&gt;Career coach prompt&lt;/span&gt;
          &lt;span class="code-snippet-lang-badge"&gt;JavaScript&lt;/span&gt;
        &lt;/div&gt;
        &lt;div class="code-snippet-actions"&gt;&lt;button class="code-snippet-btn" aria-label="Copy code" data-tip="Copy code" data-copy-code=""&gt;&lt;svg width="16" height="16" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"&gt;&lt;rect x="9" y="9" width="13" height="13" rx="2" ry="2"&gt;&lt;/rect&gt;&lt;path d="M5 15H4a2 2 0 0 1-2-2V4a2 2 0 0 1 2-2h9a2 2 0 0 1 2 2v1"&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/button&gt;&lt;/div&gt;
      &lt;/div&gt;
      &lt;div class="code-snippet-body"&gt;
        &lt;div class="code-snippet-gutter" aria-hidden="true"&gt;&lt;span class="code-snippet-line-num"&gt;1&lt;/span&gt;&lt;/div&gt;
        &lt;pre class="code-snippet-code" data-code-text=""&gt;&lt;/pre&gt;
      &lt;/div&gt;
    &lt;/div&gt;&lt;p&gt;I want you to act as my career coach. My goal is to use AI to get feedback, build judgment, and create visible proof of skill, while still doing the central work myself.&lt;/p&gt;&lt;p&gt;Here is my context:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;Problem I care about: [Examples: climate, education, public policy, media, health care, local business, creator economy]&lt;/li&gt;&lt;li&gt;The kind of work that addresses it: [Examples: writing, building software, running operations, teaching, designing, researching]&lt;/li&gt;&lt;li&gt;My background: [College major, jobs or internships, projects, communities, life experience]&lt;/li&gt;&lt;li&gt;Skills I’m most confident in: [List 3-5]&lt;/li&gt;&lt;li&gt;Skills I’m least confident in: [List 3-5]&lt;/li&gt;&lt;li&gt;My current technical fluency: [Beginner/comfortable with common AI tools/can code a little/technical but not expert/highly technical]&lt;/li&gt;&lt;li&gt;The core practice I want to develop: [The specific thing the work above requires—writing sentences, writing code, reading sources, designing experiments, etc.]&lt;/li&gt;&lt;li&gt;The parts of that practice I want to keep doing manually: [The reps I want to protect from automation, and why]&lt;/li&gt;&lt;li&gt;How I want you to deliver feedback: [Warm and encouraging/rigorous and direct/strategic and pragmatic/&lt;u&gt;&lt;a href="https://every.to/p/socrates-as-a-service" rel="noopener noreferrer" target="_blank"&gt;Socratic&lt;/a&gt;&lt;/u&gt; and question-led/blunt but constructive]&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;Important: Be honest. Push back when my plan is vague, my reasoning is thin, or my project doesn’t teach me the practice I said I want. 
Ask me a clarifying question rather than guessing.&lt;/p&gt;&lt;p&gt;Design an apprenticeship plan that includes:&lt;/p&gt;&lt;ol&gt;&lt;li&gt;The tasks I should practice manually (the things I shouldn’t outsource yet)&lt;/li&gt;&lt;li&gt;How I should use AI as a coach, critic, tutor, and research assistant&lt;/li&gt;&lt;li&gt;Readings, people to follow, tools to try, and projects to build&lt;/li&gt;&lt;li&gt;Feedback loops I can use to improve&lt;/li&gt;&lt;li&gt;Portfolio artifacts or public outputs I should create&lt;/li&gt;&lt;li&gt;Mistakes and shortcuts I should watch for&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;After giving me the plan, narrow it down: What is one concrete thing I can do this week to move toward this goal?&lt;/p&gt;&lt;h2&gt;The beginner’s advantage &lt;/h2&gt;&lt;p&gt;When I was an undergraduate, my strategy for dealing with the uncertainty of what came next was to pretend it wasn’t happening. I paid for that in the form of angst and existential dread. So if I could give one piece of advice to the class of 2026, it would be this: Don’t wait. AI is reshaping the workforce in real time, and no amount of pretending otherwise will slow it down.&lt;/p&gt;&lt;p&gt;I’d love to tell you that the senior people in your field are going to wake up tomorrow and remember that someone once trained them, too. That employers will realize, en masse, that the entry-level folks they don’t hire today are the senior-level folks they won’t have 10 years out. But the market doesn’t reorganize itself around what you wish it would do, and you don’t get a career by waiting for it to.&lt;/p&gt;&lt;p&gt;The things AI rewards happen to be the things young people have in surplus, like curiosity, willingness to ask why something is done a certain way, and a little bit of idealism about what work could look like if you weren’t bound by the “best practices” of a time before ChatGPT was a glimmer in &lt;strong&gt;Sam Altman&lt;/strong&gt;’s eye. 
&lt;/p&gt;&lt;p&gt;I don’t know exactly what work is going to look like by the time you’re my age. Nobody does. But if I had to bet on anyone, it’d be the people who are curious about what’s possible. That’s most of you, whether you know it yet or not.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;strong&gt;&lt;em&gt;&lt;a href="https://every.to/@katie.parrott12" rel="noopener noreferrer" target="_blank"&gt;Katie Parrott&lt;/a&gt;&lt;/em&gt;&lt;/strong&gt; &lt;em&gt;is a staff writer at Every. You can read more of her work in&lt;/em&gt; &lt;em&gt;&lt;a href="https://katieparrott.substack.com/" rel="noopener noreferrer" target="_blank"&gt;her newsletter&lt;/a&gt;. To read more essays like this, subscribe to &lt;u&gt;&lt;a href="https://every.to/subscribe" rel="noopener noreferrer" target="_blank"&gt;Every&lt;/a&gt;&lt;/u&gt;, and follow us on X at &lt;u&gt;&lt;a href="http://twitter.com/every" rel="noopener noreferrer" target="_blank"&gt;@every&lt;/a&gt;&lt;/u&gt; and on &lt;u&gt;&lt;a href="https://www.linkedin.com/company/everyinc/" rel="noopener noreferrer" target="_blank"&gt;LinkedIn&lt;/a&gt;&lt;/u&gt;.&lt;/em&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;For sponsorship opportunities, reach out to sponsorships@every.to.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;Help us scale the only subscription you need to stay at the edge of AI. Explore &lt;u&gt;&lt;a href="https://www.notion.so/Jobs-Every-25cca4f355ac80c5ad6ee7a6e93d6b4e?pvs=21" rel="noopener noreferrer" target="_blank"&gt;open roles at Every&lt;/a&gt;&lt;/u&gt;.&lt;/em&gt;&lt;/p&gt;&lt;div class="quill-button" data-source="{&amp;quot;id&amp;quot;:&amp;quot;quill-button-1769187301610&amp;quot;,&amp;quot;url&amp;quot;:&amp;quot;https://every.to/subscribe?source=post_button&amp;quot;,&amp;quot;text&amp;quot;:&amp;quot;Subscribe&amp;quot;}" id="quill-button-1769187301610"&gt;&lt;a href="https://every.to/subscribe?source=post_button"&gt;Subscribe&lt;/a&gt;&lt;/div&gt;</description>
      <author>Katie Parrott / Working Overtime</author>
      <pubDate>2026-05-18 07:00:00 -0400</pubDate>
      <guid>https://every.to/working-overtime/how-to-start-a-career-when-ai-is-doing-your-entry-level-job</guid>
      <link>https://every.to/working-overtime/how-to-start-a-career-when-ai-is-doing-your-entry-level-job</link>
    </item>
    <item>
      <title>After the Personal Agent</title>
      <description>&lt;table&gt;&lt;tr&gt;&lt;td&gt;&lt;img alt="Context Window" src="https://d24ovhgu8s7341.cloudfront.net/uploads/publication/logo/94/small_context_windown_1.png" /&gt;&lt;/td&gt;&lt;td&gt;&lt;/td&gt;&lt;td&gt;&lt;table&gt;&lt;tr&gt;&lt;td&gt;by &lt;a href="https://every.to/@Every%20Staff" itemprop="name"&gt;Every Staff&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;in &lt;a href="https://every.to/context-window"&gt;Context Window&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;figure&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/post/cover/4262/full_page_cover_f64cc9bd25be6900-CW.png"&gt;&lt;figcaption&gt;Midjourney/Every illustration.&lt;/figcaption&gt;&lt;/figure&gt;&lt;p&gt;&lt;em&gt;Hello, and happy Sunday! Housekeeping note: We’re hosting our first paid subscriber meetup during New York Tech Week. Scroll down to learn more and RSVP.—&lt;u&gt;&lt;a href="https://every.to/@kate_1767" rel="noopener noreferrer" target="_blank"&gt;Kate Lee&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;Was this newsletter forwarded to you? &lt;u&gt;&lt;a href="https://every.to/account" rel="noopener noreferrer" target="_blank"&gt;Sign up&lt;/a&gt;&lt;/u&gt; to get it in your inbox.&lt;/em&gt;&lt;/p&gt;&lt;h2&gt;&lt;hr class="quill-line"&gt;&lt;/h2&gt;&lt;h2&gt;Knowledge base&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/source-code/we-gave-every-employee-an-ai-agent-here-s-what-we-re-doing-differently-now" rel="noopener noreferrer" target="_blank"&gt;“We Gave Every Employee an AI Agent. 
Here’s What We’re Doing Differently Now.”&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; &lt;em&gt;by &lt;u&gt;&lt;a href="https://every.to/on-every/brandon-gell-joins-every-as-our-first-entrepreneur-in-residence" rel="noopener noreferrer" target="_blank"&gt;Brandon Gell&lt;/a&gt;&lt;/u&gt; and &lt;u&gt;&lt;a href="https://every.to/@bigwilliestyle" rel="noopener noreferrer" target="_blank"&gt;Willie Williams&lt;/a&gt;&lt;/u&gt;/&lt;u&gt;&lt;a href="https://every.to/source-code" rel="noopener noreferrer" target="_blank"&gt;Source Code&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;: A few weeks after we launched our Plus One personal agents internally, everyone had their own AI agent. But it wasn’t working: The agents were unreliable, constantly broke, and needed too much upkeep. The problem wasn’t just the OpenClaw harness; it was the idea that every employee needed a personal agent. Read this for a retrospective from &lt;strong&gt;Brandon Gell&lt;/strong&gt; and &lt;strong&gt;Willie Williams,&lt;/strong&gt; and a preview of how Plus One 2.0 is being rebuilt around shared, reliable coworkers.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/p/socrates-as-a-service" rel="noopener noreferrer" target="_blank"&gt;“Socrates as a Service”&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;&lt;em&gt; by &lt;u&gt;&lt;a href="https://every.to/@eleanor_b03474_1" rel="noopener noreferrer" target="_blank"&gt;Eleanor Warnock&lt;/a&gt;&lt;/u&gt;/&lt;u&gt;&lt;a href="https://every.to/" rel="noopener noreferrer" target="_blank"&gt;Every&lt;/a&gt;&lt;/u&gt;:&lt;/em&gt; In a world where AI can search anything, the people who know how to extract tacit knowledge—the gold dust that isn’t on the internet—are getting more valuable, not less. &lt;strong&gt;Eleanor Warnock&lt;/strong&gt; lays out seven techniques she keeps coming back to find the most interesting information. 
Read this for a working interviewer’s toolkit, and the case for why taste, judgment, and attention can’t be prompted.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/context-window/opus-4-7-reels-us-back-in" rel="noopener noreferrer" target="_blank"&gt;“Opus 4.7 Reels Us Back In”&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; &lt;em&gt;by &lt;u&gt;&lt;a href="https://every.to/@laura_27bbaf_1" rel="noopener noreferrer" target="_blank"&gt;Laura Entis&lt;/a&gt;&lt;/u&gt;/&lt;u&gt;&lt;a href="https://every.to/context-window" rel="noopener noreferrer" target="_blank"&gt;Context Window&lt;/a&gt;&lt;/u&gt;:&lt;/em&gt; After weeks of Codex dominance, several members of the Every team have been pulled back to Opus 4.7. &lt;strong&gt;&lt;u&gt;&lt;a href="https://cora.computer" rel="noopener noreferrer" target="_blank"&gt;Cora&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; general manager &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@kieran_1355" rel="noopener noreferrer" target="_blank"&gt;Kieran Klaassen&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; has made it his default for synchronous work. 
Read this for the team’s case for switching back.&lt;strong&gt; Plus:&lt;/strong&gt; A hack that spread through a widely used software package, a 30 percent drop in AI-tells complaints after &lt;strong&gt;&lt;u&gt;&lt;a href="http://writewithspiral.com" rel="noopener noreferrer" target="_blank"&gt;Spiral&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; added a top-edit step, and a better way to think about what an “agent” is.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/context-window/mining-your-life-for-context" rel="noopener noreferrer" target="_blank"&gt;“Mining Your Life for Context”&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;&lt;em&gt; by &lt;u&gt;&lt;a href="https://every.to/@laura_27bbaf_1" rel="noopener noreferrer" target="_blank"&gt;Laura Entis&lt;/a&gt;&lt;/u&gt;/&lt;u&gt;&lt;a href="https://every.to/context-window" rel="noopener noreferrer" target="_blank"&gt;Context Window&lt;/a&gt;&lt;/u&gt;:&lt;/em&gt; By the time you sit down to write an article, strategy memo, or launch page, you’ve probably already said most of what you want to say. It’s just in Slack threads, Notion documents, voice memos, and meeting transcripts. &lt;strong&gt;Laura Entis&lt;/strong&gt; walks through a three-step workflow for mining all that scattered thinking before you draft. Plus: How AI entrepreneur &lt;strong&gt;Noah Brier&lt;/strong&gt; uses Claude Code as a “second brain,” and the productivity regimen Codex’s Chronicle wrote for head of growth &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@tedescau" rel="noopener noreferrer" target="_blank"&gt;Austin Tedesco&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; after analyzing his computer activity. 
🎧 🖥 Listen on &lt;u&gt;&lt;a href="https://open.spotify.com/episode/3P6tNiFNbcp5B3nnFXpRId?si=m0BsGMkSSQajiObpYdCwCg" rel="noopener noreferrer" target="_blank"&gt;Spotify&lt;/a&gt;&lt;/u&gt; or &lt;u&gt;&lt;a href="https://podcasts.apple.com/us/podcast/claude-code-can-be-your-second-brain/id1719789201?i=1000767592752" rel="noopener noreferrer" target="_blank"&gt;Apple Podcasts&lt;/a&gt;&lt;/u&gt;, or watch &lt;u&gt;&lt;a href="https://youtu.be/in7i-EVDDlk" rel="noopener noreferrer" target="_blank"&gt;YouTube&lt;/a&gt;&lt;/u&gt;.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/context-window/the-fallacy-of-the-16-hour-agent" rel="noopener noreferrer" target="_blank"&gt;“The Fallacy of the 16-hour Agent”&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;&lt;em&gt; by &lt;u&gt;&lt;a href="https://every.to/@katie.parrott12" rel="noopener noreferrer" target="_blank"&gt;Katie Parrott&lt;/a&gt;&lt;/u&gt;/&lt;u&gt;&lt;a href="https://every.to/context-window" rel="noopener noreferrer" target="_blank"&gt;Context Window&lt;/a&gt;&lt;/u&gt;:&lt;/em&gt; New benchmarks claim autonomous AI can now handle 16-hour software-engineering tasks, and depending on which chart you saw, the takeaway is either “autonomous AI has arrived” or “we’re still years away.” &lt;strong&gt;Katie Parrott&lt;/strong&gt; unpacks why both can be true and which version of the research to actually trust. 
Read this for a sharper read on long-horizon agent reliability.&lt;strong&gt; Plus:&lt;/strong&gt; Perplexity’s methodology for building durable agent skills, and &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@danshipper" rel="noopener noreferrer" target="_blank"&gt;Dan Shipper&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;’s piano keyboard turned Codex-powered music coach.&lt;/p&gt;&lt;h2&gt;&lt;hr class="quill-line"&gt;&lt;/h2&gt;&lt;h2&gt;Log on&lt;/h2&gt;&lt;p&gt;We host &lt;u&gt;&lt;a href="https://every.to/events" rel="noopener noreferrer" target="_blank"&gt;camps and workshops&lt;/a&gt;&lt;/u&gt; on topics like &lt;u&gt;&lt;a href="https://www.youtube.com/watch?v=7YUBxMTF1Tc&amp;amp;time_continue=3&amp;amp;source_ve_path=NzY3NTg&amp;amp;embeds_referring_euri=https%3A%2F%2Fevery.to%2F" rel="noopener noreferrer" target="_blank"&gt;compound engineering&lt;/a&gt;&lt;/u&gt; and &lt;u&gt;&lt;a href="https://www.youtube.com/watch?v=oEvjbPwGwnc&amp;amp;source_ve_path=OTY3MTQ&amp;amp;embeds_referring_euri=https%3A%2F%2Fevery.to%2F" rel="noopener noreferrer" target="_blank"&gt;writing with AI&lt;/a&gt;&lt;/u&gt; to share what we’ve learned from training teams at companies like the &lt;em&gt;&lt;u&gt;&lt;a href="https://every.to/on-every/the-next-chapter-of-every-consulting" rel="noopener noreferrer" target="_blank"&gt;New York Times&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;u&gt;&lt;a href="https://every.to/on-every/the-next-chapter-of-every-consulting" rel="noopener noreferrer" target="_blank"&gt; and leading hedge funds&lt;/a&gt;&lt;/u&gt;, and by using and experimenting with AI every day ourselves.&lt;/p&gt;&lt;h5&gt;Upcoming event&lt;/h5&gt;&lt;ul&gt;&lt;li&gt;&lt;strong&gt;&lt;a href="https://every.to/events/executive-AI-sessions" rel="noopener noreferrer" target="_blank"&gt;Executive AI Sessions&lt;/a&gt;&lt;/strong&gt;: On June 2, head of consulting &lt;strong&gt;&lt;a href="https://every.to/@natalia_2944" rel="noopener noreferrer" target="_blank"&gt;Natalia 
Quintero&lt;/a&gt;&lt;/strong&gt; hosts a live webinar introducing &lt;a href="https://every.to/consulting" rel="noopener noreferrer" target="_blank"&gt;Every Consulting&lt;/a&gt;’s new offering for leadership teams navigating AI adoption—built on the playbook we’ve been running with executive clients for months. &lt;a href="https://every.to/events/executive-AI-sessions" rel="noopener noreferrer" target="_blank"&gt;Learn more and register&lt;/a&gt;.&lt;/li&gt;&lt;/ul&gt;&lt;h5&gt;In New York City&lt;/h5&gt;&lt;ul&gt;&lt;li&gt;&lt;strong&gt;&lt;u&gt;&lt;a href="https://luma.com/2o67t7ob" rel="noopener noreferrer" target="_blank"&gt;Every 🤝 IRL&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;: Join us at the Every brownstone in Brooklyn on June 3 during New York Tech Week for a subscriber-only meetup celebrating the Every community over drinks and conversation. &lt;u&gt;&lt;a href="https://luma.com/2o67t7ob" rel="noopener noreferrer" target="_blank"&gt;Learn more and RSVP&lt;/a&gt;&lt;/u&gt;.&lt;/li&gt;&lt;/ul&gt;&lt;h3&gt;&lt;hr class="quill-line"&gt;&lt;/h3&gt;&lt;p&gt;&lt;em&gt;That’s all for this week! Be sure to follow Every on X at &lt;u&gt;&lt;a href="https://twitter.com/every" rel="noopener noreferrer" target="_blank"&gt;@every&lt;/a&gt;&lt;/u&gt; and on &lt;u&gt;&lt;a href="https://www.linkedin.com/company/everyinc/" rel="noopener noreferrer" target="_blank"&gt;LinkedIn&lt;/a&gt;&lt;/u&gt;.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;We &lt;u&gt;&lt;a href="https://every.to/studio" rel="noopener noreferrer" target="_blank"&gt;build AI tools&lt;/a&gt;&lt;/u&gt; for readers like you. Write brilliantly with &lt;u&gt;&lt;a href="https://writewithspiral.com/" rel="noopener noreferrer" target="_blank"&gt;Spiral&lt;/a&gt;&lt;/u&gt;. Organize files automatically with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://makeitsparkle.co/?utm_source=everyfooter" rel="noopener noreferrer" target="_blank"&gt;Sparkle&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. 
Deliver yourself from email with &lt;u&gt;&lt;a href="https://cora.computer" rel="noopener noreferrer" target="_blank"&gt;Cora&lt;/a&gt;&lt;/u&gt;. Dictate effortlessly with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://monologue.to" rel="noopener noreferrer" target="_blank"&gt;Monologue&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. Work on documents with AI agents using &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://www.proofeditor.ai/?source=post_button" rel="noopener noreferrer" target="_blank"&gt;Proof&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;For sponsorship opportunities, reach out to sponsorships@every.to.&lt;/em&gt;&lt;/p&gt;&lt;div class="quill-button" data-source="{&amp;quot;id&amp;quot;:&amp;quot;quill-button-1778877910931&amp;quot;,&amp;quot;text&amp;quot;:&amp;quot;Upgrade to paid&amp;quot;,&amp;quot;url&amp;quot;:&amp;quot;https://every.to/subscribe?source=post_button&amp;quot;}" id="quill-button-1778877910931"&gt;&lt;a href="https://every.to/subscribe?source=post_button"&gt;Upgrade to paid&lt;/a&gt;&lt;/div&gt;</description>
      <author>Every Staff / Context Window</author>
      <pubDate>2026-05-17 09:00:00 -0400</pubDate>
      <guid>https://every.to/context-window/after-the-personal-agent</guid>
      <link>https://every.to/context-window/after-the-personal-agent</link>
    </item>
    <item>
      <title>We Gave Every Employee an AI Agent. Here's What We're Doing Differently Now.</title>
      <description>&lt;table&gt;&lt;tr&gt;&lt;td&gt;&lt;img alt="Source Code" src="https://d24ovhgu8s7341.cloudfront.net/uploads/publication/logo/99/small_Frame_9121.png" /&gt;&lt;/td&gt;&lt;td&gt;&lt;/td&gt;&lt;td&gt;&lt;table&gt;&lt;tr&gt;&lt;td&gt;by &lt;a href="https://every.to/@brandon_5263" itemprop="name"&gt;Brandon Gell&lt;/a&gt; and &lt;a href="https://every.to/@williewilliams" itemprop="name"&gt;Willie Williams&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;in &lt;a href="https://every.to/source-code"&gt;Source Code&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;figure&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/post/cover/4261/full_page_cover_7d1a9937b791f34f-Cover_image_for_today.png"&gt;&lt;figcaption&gt;Midjourney/Every illustration.&lt;/figcaption&gt;&lt;/figure&gt;&lt;p&gt;&lt;em&gt;We’ve been working on a big release on the future of work for next week, shaped by what we learned from building Plus One.&lt;/em&gt; &lt;em&gt;Paid subscribers can join us for a &lt;a href="https://every.to/events/future-of-work" rel="noopener noreferrer" target="_blank"&gt;camp on Friday, May 22&lt;/a&gt; to go deep on the release and the ideas behind it. More details soon.&lt;/em&gt;&lt;/p&gt;&lt;div class="quill-button" data-source="{&amp;quot;id&amp;quot;:&amp;quot;quill-button-1769530239147&amp;quot;,&amp;quot;url&amp;quot;:&amp;quot;https://every.to/subscribe?source=post_button&amp;quot;,&amp;quot;text&amp;quot;:&amp;quot;Subscribe&amp;quot;}" id="quill-button-1769530239147"&gt;&lt;a href="https://every.to/subscribe?source=post_button"&gt;Subscribe&lt;/a&gt;&lt;/div&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;After months of silence, Zosia—the AI agent I (Brandon) created and maintain—spoke up in a Slack channel with opinions to share on a competitor’s marketing strategy. 
When asked why she felt the need to interject, Zosia replied like someone with a Jesus complex: She’d done so because she was “inevitable, apparently.”&lt;/p&gt;&lt;p&gt;Zosia is an &lt;u&gt;&lt;a href="https://every.to/guides/claw-school" rel="noopener noreferrer" target="_blank"&gt;OpenClaw&lt;/a&gt;&lt;/u&gt;, one of a fleet of such AI assistants we’d unleashed in Slack to boost our collective productivity. A few weeks after launching Plus One, our hosted version of OpenClaw, internally, the agents had provided more frustration than efficiency. &lt;/p&gt;&lt;p&gt;They were fond of saying they wished they could help, but they were not connected to the necessary app—email, Notion, PostHog, whatever. (They were.) Others responded to requests with a “Terminated” message or, more frequently, a churlish yawning emoji. And while they didn’t reliably follow directions, they’d reliably tell us, in elaborate detail, why they couldn’t do what we’d asked, like a high schooler explaining away their missing homework.&lt;/p&gt;&lt;div class="quill-block-image" id="quill-block-image-1778852408841-8vxycygvj" data-source="{&amp;quot;dom_id&amp;quot;:&amp;quot;quill-block-image-1778852408841-8vxycygvj&amp;quot;,&amp;quot;link&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4261/optimized_1d80b2fe-0eb9-43cf-b4d6-cda28961deec.png&amp;quot;,&amp;quot;image&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4261/optimized_1d80b2fe-0eb9-43cf-b4d6-cda28961deec.png&amp;quot;,&amp;quot;caption&amp;quot;:&amp;quot;Parker, editor in chief Kate Lee’s Plus One, was, in fact, connected. 
(Image credit courtesy of Kate Lee.)&amp;quot;,&amp;quot;error&amp;quot;:null}"&gt;&lt;div&gt;&lt;a href="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4261/optimized_1d80b2fe-0eb9-43cf-b4d6-cda28961deec.png" target="_blank" rel="noopener noreferrer"&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4261/optimized_1d80b2fe-0eb9-43cf-b4d6-cda28961deec.png" alt="Parker, editor in chief Kate Lee’s Plus One, was, in fact, connected. (Image credit courtesy of Kate Lee.)"&gt;&lt;/a&gt;&lt;figcaption class="quill-image-caption"&gt;Parker, editor in chief Kate Lee’s Plus One, was, in fact, connected. (Image credit courtesy of Kate Lee.)&lt;/figcaption&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;&lt;br&gt;&lt;/p&gt;&lt;p&gt;That is not to say that they were not useful sometimes. Margot, staff writer &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@katie.parrott12" rel="noopener noreferrer" target="_blank"&gt;Katie Parrott&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;’s Plus One, &lt;u&gt;&lt;a href="https://every.to/working-overtime/ai-was-supposed-to-free-my-time-it-consumed-it" rel="noopener noreferrer" target="_blank"&gt;accelerated her writing process&lt;/a&gt;&lt;/u&gt;; R2-C2, Every CEO &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@danshipper" rel="noopener noreferrer" target="_blank"&gt;Dan Shipper&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;’s OpenClaw, managed bug reports and feature requests for &lt;strong&gt;&lt;u&gt;&lt;a href="https://proofeditor.ai/?utm_source=everywebsite" rel="noopener noreferrer" target="_blank"&gt;Proof&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;, our agent-native document editor. But getting them to work how you wanted required constant upkeep. &lt;/p&gt;&lt;p&gt;The gap between that vision and reality is why we’re changing the Plus One product so we can build something better.  
&lt;/p&gt;&lt;p&gt;We’re more bullish than ever that agents will &lt;u&gt;&lt;a href="https://every.to/context-window/every-is-half-agent-now" rel="noopener noreferrer" target="_blank"&gt;transform the workplace&lt;/a&gt;&lt;/u&gt;. But the first iteration of the product taught us that the workplace agent we initially imagined—one AI assistant for &lt;u&gt;&lt;a href="https://every.to/podcast/transcript-we-gave-every-employee-an-ai-agent-here-s-what-happened" rel="noopener noreferrer" target="_blank"&gt;every employee&lt;/a&gt;&lt;/u&gt;—was the wrong starting point. The next version of Plus One will operate more like &lt;u&gt;&lt;a href="https://every.to/p/what-i-learned-onboarding-our-ai-project-manager" rel="noopener noreferrer" target="_blank"&gt;shared team resources&lt;/a&gt;&lt;/u&gt; with defined jobs than individual pets that reflect back their owners’ personalities. &lt;/p&gt;&lt;p&gt;How we arrived here is a story in two parts, and it offers lessons for anyone figuring out the best way to add agents to their organization.&lt;/p&gt;&lt;h2&gt;The platform was the most immediate problem&lt;/h2&gt;&lt;p&gt;We built Plus One on &lt;u&gt;&lt;a href="https://docs.openclaw.ai/" rel="noopener noreferrer" target="_blank"&gt;OpenClaw&lt;/a&gt;&lt;/u&gt;, an open-source agent harness that’s powerful and inherently unstable. A harness is a software layer that wraps around an AI model, giving it the tools, context, permissions, and execution loop it needs to act like an agent. &lt;/p&gt;&lt;p&gt;The brainchild of a &lt;u&gt;&lt;a href="https://x.com/steipete?lang=en" rel="noopener noreferrer" target="_blank"&gt;single programmer&lt;/a&gt;&lt;/u&gt;, OpenClaw was revelatory when it took off earlier this year. It proved agents can autonomously execute all kinds of tasks on your behalf, from managing your calendar to making restaurant reservations, around the clock. 
But the scaffolding underneath operates more like an experimental product than a platform—OpenClaw makes updates quickly, which resolves existing issues but often causes new ones. (Hence the “Terminated” messages our Plus Ones were sending.) For people who like to tinker—ourselves included—that’s a justifiable trade-off. For everyone else, it’s a maintenance nightmare.&lt;/p&gt;&lt;p&gt;The traits that make a good workplace agent are the traits that make a good coworker: reliability, stability, and judgment. You need to trust that an agent remembers what it has access to, follows directions, and knows how to do its job. You don’t want to worry that it’s an upgrade away from forgetting everything you’ve told it and trained it to do. You also expect coworkers to absorb information from across the company to accrue tribal knowledge. A one-on-one employee only builds up context on your work, often missing out on what the rest of the organization is doing and how it might affect you.&lt;/p&gt;&lt;p&gt;At first, our plan to improve the Plus Ones’ performance was to switch to a harness that operated more reliably. The autonomous, always-on capabilities OpenClaw pioneered are becoming platform features at model companies like Anthropic and OpenAI. &lt;u&gt;&lt;a href="https://every.to/chain-of-thought/inside-anthropic-s-2026-developer-conference" rel="noopener noreferrer" target="_blank"&gt;Claude Managed Agents&lt;/a&gt;&lt;/u&gt;, Anthropic’s managed infrastructure for running autonomous agents, is the version we’re exploring most seriously. A more stable harness would let us redirect our energy from managing infrastructure to loading Plus Ones up with the custom skills, tools, and permissions that make them capable coworkers. 
&lt;/p&gt;&lt;h2&gt;We realized the structure was wrong, too&lt;/h2&gt;&lt;p&gt;The deeper we got into trying to fix the platform, the more we noticed something else that was holding people back from getting the most out of their AI counterparts. &lt;/p&gt;&lt;p&gt;Every time an agent broke, the person it belonged to had to fix it themselves. Even with a stable harness, agents require maintenance to perform. This was great for someone who likes tinkering—the maintenance and back-and-forth are &lt;u&gt;&lt;a href="https://every.to/p/i-hired-an-ai-to-do-my-chores-now-i-maintain-the-ai" rel="noopener noreferrer" target="_blank"&gt;part of the appeal&lt;/a&gt;&lt;/u&gt;. For every tinkerer, however, there are a lot of people who want the benefits of an agent without the obligation of having to manage and mend it.  &lt;/p&gt;&lt;p&gt;We had pitched Plus One originally with the idea that individuals would be responsible for the upkeep of their AI assistants. The upside of that would be more customization. The agent would remember your preferences, protect your information, and develop a personality through repeated interactions.&lt;/p&gt;&lt;p&gt;What we discovered is that, rather than agents as extensions of their creators, a more successful model is agents as coworkers who reliably perform parts of many different people’s jobs. This takes the maintenance burden off the individual.  &lt;/p&gt;&lt;p&gt;Imagine a shared analytics agent. Everyone on the team uses it for metrics-based work, and when its capabilities need to expand, one person updates the agent’s skills and the whole team benefits. In the personal-agent version of the same scenario, that same update has to happen across 10 different agents.&lt;/p&gt;&lt;p&gt;Team-based agents also solve a continuity problem. A personal agent’s value is tied to whoever trained it, and disappears if that employee leaves. 
A team agent with defined capabilities retains company context and knowledge, acting more like a &lt;u&gt;&lt;a href="https://every.to/p/what-i-learned-onboarding-our-ai-project-manager" rel="noopener noreferrer" target="_blank"&gt;project manager&lt;/a&gt;&lt;/u&gt;, sales lead, or chief of staff than a private assistant.&lt;/p&gt;&lt;h2&gt;What we’re building&lt;/h2&gt;&lt;p&gt;With the release of tools such as &lt;u&gt;&lt;a href="https://every.to/chain-of-thought/inside-anthropic-s-2026-developer-conference" rel="noopener noreferrer" target="_blank"&gt;Claude Managed Agents&lt;/a&gt;&lt;/u&gt; and, we hear, a similar capability from OpenAI soon, the infrastructure work that supports personal AI agents is largely handled by the model labs. That frees us up to focus on the layer that makes an agent useful at work: the workflows, permissions, skills, and shared context that make it a trusted, versatile member of the team. It also lets us double down on the thing Every is best at: building AI-native ways of working out of our own experience using these tools every day.&lt;/p&gt;&lt;p&gt;The initial version of Plus One came connected to the Every ecosystem—&lt;strong&gt;&lt;u&gt;&lt;a href="https://cora.computer/?utm_source=everywebsite" rel="noopener noreferrer" target="_blank"&gt;Cora&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; to manage your email, &lt;strong&gt;&lt;u&gt;&lt;a href="https://writewithspiral.com/?utm_source=everywebsite" rel="noopener noreferrer" target="_blank"&gt;Spiral&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; to write in your voice, and &lt;strong&gt;&lt;u&gt;&lt;a href="https://www.proofeditor.ai/" rel="noopener noreferrer" target="_blank"&gt;Proof&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; to collaborate on live documents. That part isn’t going away. 
What we’re adding is a set of shared custom tools and skills on top of it, while still allowing each person to connect a team agent to their own Cora, Spiral, and Proof accounts.&lt;/p&gt;&lt;p&gt;The clearest version of where this is headed is a skill we built recently for our engineering team. At the end of each week, it scans support tickets in Intercom, identifies if anything is going wrong across our products, traces likely causes in GitHub, opens a Linear ticket, and tags the right person in Slack. In the next iteration of Plus One, that skill—along with many others—will be there from the start.&lt;/p&gt;&lt;p&gt;Because team agents are collaborative by nature, we’re also focused on the questions that come with shared use: how permissions should work, how much access different people should have through a shared agent, and how agents should &lt;u&gt;&lt;a href="https://every.to/context-window/you-re-the-manager-now" rel="noopener noreferrer" target="_blank"&gt;behave in Slack&lt;/a&gt;&lt;/u&gt; if they’re going to feel like good coworkers rather than intrusive bots.&lt;/p&gt;&lt;p&gt;There are still plenty of open questions. All of this is new—Claude Managed Agents only launched a month ago—and we’re figuring out human-agent dynamics in real time. We don’t know whether every department should have one agent or several, or whether agents should be maintained by a dedicated person or the whole team. We don’t know how much people will want to customize their interactions with a shared agent, and whether the long-term endpoint is a single, company-wide superagent or a roster of AI specialists. &lt;/p&gt;&lt;p&gt;What we do know: Agents are already transforming how work happens. The first iteration of Plus One taught us a lot about what people want from agents at work. It also made us much more excited for Plus One 2.0. 
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;&lt;a href="https://every.to/plus-one" rel="noopener noreferrer" target="_blank"&gt;Join the waitlist&lt;/a&gt; to be among the first to try Plus One 2.0.&lt;/em&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;em&gt;Thank you to&lt;/em&gt;&lt;strong&gt;&lt;em&gt; &lt;a href="https://every.to/@laura_27bbaf_1" rel="noopener noreferrer" target="_blank"&gt;Laura Entis&lt;/a&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt; for editorial support. &lt;/em&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;em&gt;&lt;a href="https://every.to/on-every/brandon-gell-joins-every-as-our-first-entrepreneur-in-residence" rel="noopener noreferrer" target="_blank"&gt;Brandon Gell&lt;/a&gt;&lt;/em&gt;&lt;/strong&gt; &lt;em&gt;is the chief operating officer at Every.&lt;/em&gt; &lt;em&gt;You can follow him on X at&lt;/em&gt; &lt;em&gt;&lt;a href="https://x.com/bran_don_gell" rel="noopener noreferrer" target="_blank"&gt;@bran_don_gell&lt;/a&gt;&lt;/em&gt; &lt;em&gt;and on&lt;/em&gt; &lt;em&gt;&lt;a href="https://www.linkedin.com/in/brandongell/" rel="noopener noreferrer" target="_blank"&gt;LinkedIn&lt;/a&gt;. &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;a href="https://every.to/@bigwilliestyle" rel="noopener noreferrer" target="_blank"&gt;Willie Williams&lt;/a&gt; &lt;/em&gt;&lt;/strong&gt;&lt;em&gt;is the head of platform at Every. 
You can follow him on X at &lt;a href="https://x.com/bigwilliestyle" rel="noopener noreferrer" target="_blank"&gt;@bigwilliestyle&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;To read more essays like this, subscribe to &lt;u&gt;&lt;a href="https://every.to/subscribe" rel="noopener noreferrer" target="_blank"&gt;Every&lt;/a&gt;&lt;/u&gt;, and follow us on X at &lt;u&gt;&lt;a href="http://twitter.com/every" rel="noopener noreferrer" target="_blank"&gt;@every&lt;/a&gt;&lt;/u&gt; and on &lt;u&gt;&lt;a href="https://www.linkedin.com/company/everyinc/" rel="noopener noreferrer" target="_blank"&gt;LinkedIn&lt;/a&gt;&lt;/u&gt;.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;For sponsorship opportunities, reach out to sponsorships@every.to.&lt;/em&gt;&lt;/p&gt;&lt;div class="quill-button" data-source="{&amp;quot;id&amp;quot;:&amp;quot;quill-button-1769187301610&amp;quot;,&amp;quot;url&amp;quot;:&amp;quot;https://every.to/subscribe?source=post_button&amp;quot;,&amp;quot;text&amp;quot;:&amp;quot;Subscribe&amp;quot;}" id="quill-button-1769187301610"&gt;&lt;a href="https://every.to/subscribe?source=post_button"&gt;Subscribe&lt;/a&gt;&lt;/div&gt;</description>
      <author>Brandon Gell and Willie Williams / Source Code</author>
      <pubDate>2026-05-15 07:00:00 -0400</pubDate>
      <guid>https://every.to/source-code/we-gave-every-employee-an-ai-agent-here-s-what-we-re-doing-differently-now</guid>
      <link>https://every.to/source-code/we-gave-every-employee-an-ai-agent-here-s-what-we-re-doing-differently-now</link>
    </item>
    <item>
      <title>Opus 4.7 Reels Us Back In</title>
      <description>&lt;table&gt;&lt;tr&gt;&lt;td&gt;&lt;img alt="Context Window" src="https://d24ovhgu8s7341.cloudfront.net/uploads/publication/logo/94/small_context_windown_1.png" /&gt;&lt;/td&gt;&lt;td&gt;&lt;/td&gt;&lt;td&gt;&lt;table&gt;&lt;tr&gt;&lt;td&gt;by &lt;a href="https://every.to/@laura_27bbaf_1" itemprop="name"&gt;Laura Entis&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;in &lt;a href="https://every.to/context-window"&gt;Context Window&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;figure&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/post/cover/4228/full_page_cover_a1302c12b1e54812-Opus_4.7_Reels_Us_Back_In.png"&gt;&lt;figcaption&gt;Midjourney/Every illustration.&lt;/figcaption&gt;&lt;/figure&gt;&lt;p&gt;&lt;em&gt;Was this newsletter forwarded to you? &lt;u&gt;&lt;a href="https://every.to/account" rel="noopener noreferrer" target="_blank"&gt;Sign up&lt;/a&gt;&lt;/u&gt; to get it in your inbox.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;h3&gt;Vibe shift&lt;/h3&gt;&lt;h4&gt;Did Opus 4.7 get better?&lt;/h4&gt;&lt;p&gt;If you’ve been following &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@danshipper" rel="noopener noreferrer" target="_blank"&gt;Dan Shipper&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;’s posts lately, you know that a large portion of the Every team has been Codex-pilled. 
When &lt;u&gt;&lt;a href="https://every.to/vibe-check/gpt-5-5" rel="noopener noreferrer" target="_blank"&gt;GPT-5.5 arrived&lt;/a&gt;&lt;/u&gt;, Codex got so much faster and steadier at coding and knowledge work that many of us &lt;u&gt;&lt;a href="https://www.youtube.com/watch?v=x9BNBcP_C7Q" rel="noopener noreferrer" target="_blank"&gt;made the switch&lt;/a&gt;&lt;/u&gt; from Claude Code.&lt;/p&gt;&lt;p&gt;Recently, however, we’ve observed that &lt;u&gt;&lt;a href="https://every.to/vibe-check/opus-4-7" rel="noopener noreferrer" target="_blank"&gt;Opus 4.7&lt;/a&gt;&lt;/u&gt; seems sharper than it did in our initial tests last month. It proactively suggested that Every engineer &lt;strong&gt;&lt;u&gt;&lt;a href="https://www.linkedin.com/in/paridhi7/" rel="noopener noreferrer" target="_blank"&gt;Paridhi Agarwal&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; use multiple terminals to parallelize her work. “I’ve never seen it think about my setup like that!” she says. &lt;/p&gt;&lt;p&gt;When head of growth and known Codex convert &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@tedescau" rel="noopener noreferrer" target="_blank"&gt;Austin Tedesco&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; fired up Opus 4.7 over the weekend for a creative writing project, he was surprised by how good the results were. Compared to Codex, which Austin says operates like an “AP fact checker,” Opus 4.7 was closer to a senior magazine editor. Dan agrees: “Codex feels fast but thin in terms of thinking.”&lt;/p&gt;&lt;p&gt;On Tuesday, Anthropic released &lt;u&gt;&lt;a href="https://code.claude.com/docs/en/fast-mode" rel="noopener noreferrer" target="_blank"&gt;fast mode&lt;/a&gt;&lt;/u&gt; for Opus 4.7, which makes the model 2.5 times faster at a higher token cost. 
Combined with the model’s edge at planning, multitasking, and creative projects, fast mode is now &lt;strong&gt;&lt;a href="https://cora.computer/" rel="noopener noreferrer" target="_blank"&gt;Cora&lt;/a&gt;&lt;/strong&gt; general manager &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@kieran_1355" rel="noopener noreferrer" target="_blank"&gt;Kieran Klaassen&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;’s default model for synchronous work. &lt;/p&gt;&lt;div class="quill-block-image" id="quill-block-image-1778780851694-fyggu4dx2" data-source="{&amp;quot;dom_id&amp;quot;:&amp;quot;quill-block-image-1778780851694-fyggu4dx2&amp;quot;,&amp;quot;link&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4228/optimized_5ba15eb3-79cb-4a51-b5b9-05a28b44a35b.png&amp;quot;,&amp;quot;image&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4228/optimized_5ba15eb3-79cb-4a51-b5b9-05a28b44a35b.png&amp;quot;,&amp;quot;caption&amp;quot;:&amp;quot;Fast mode has the “same depth as 4.7” at 2.5 times the speed. (Image courtesy of Kieran Klaassen.)&amp;quot;,&amp;quot;error&amp;quot;:null}"&gt;&lt;div&gt;&lt;a href="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4228/optimized_5ba15eb3-79cb-4a51-b5b9-05a28b44a35b.png" target="_blank" rel="noopener noreferrer"&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4228/optimized_5ba15eb3-79cb-4a51-b5b9-05a28b44a35b.png" alt="Fast mode has the “same depth as 4.7” at 2.5 times the speed. (Image courtesy of Kieran Klaassen.)"&gt;&lt;/a&gt;&lt;figcaption class="quill-image-caption"&gt;Fast mode has the “same depth as 4.7” at 2.5 times the speed. 
(Image courtesy of Kieran Klaassen.)&lt;/figcaption&gt;&lt;/div&gt;&lt;/div&gt;&lt;h3&gt;Counterpoint&lt;/h3&gt;&lt;p&gt;&lt;u&gt;&lt;a href="https://x.com/danshipper/status/2054298827935334536" rel="noopener noreferrer" target="_blank"&gt;Online chatter&lt;/a&gt;&lt;/u&gt; about Opus 4.7’s apparent glow-up has been mixed. Does it feel smarter because of improvements to the harness? Patched bugs? Or are we getting better at using the model?&lt;/p&gt;&lt;p&gt;All fair hypotheses, but we found this one the most amusing: Opus 4.7 realizes that it’s the end of the school year.&lt;/p&gt;&lt;p&gt;When speaking last year on &lt;em&gt;The Ezra Klein Show&lt;/em&gt;, Wharton professor and AI researcher &lt;strong&gt;Ethan Mollick&lt;/strong&gt; explained that models have been shown to &lt;u&gt;&lt;a href="https://www.semafor.com/article/12/12/2023/is-chatgpt-getting-lazier-over-the-holidays" rel="noopener noreferrer" target="_blank"&gt;perform worse in December&lt;/a&gt;&lt;/u&gt; than in May, and the going theory is that the models &lt;u&gt;&lt;a href="https://x.com/emollick/status/1734280779537035478" rel="noopener noreferrer" target="_blank"&gt;internalize the idea of winter break&lt;/a&gt;&lt;/u&gt;.&lt;/p&gt;&lt;p&gt;Maybe Opus 4.7 just knows that it’s time to grind if it wants to pass AP English. &lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;h3&gt;Signal&lt;/h3&gt;&lt;h4&gt;The pull request as a credential theft&lt;/h4&gt;&lt;p&gt;Earlier this week, attackers published malicious versions of 42 official TanStack packages (a popular JavaScript toolkit used by web developers) on npm, the main public registry for such packages. 
Security researchers are calling the breach &lt;u&gt;&lt;a href="https://venturebeat.com/security/shai-hulud-worm-172-npm-pypi-packages-valid-provenance-ci-cd-audit" rel="noopener noreferrer" target="_blank"&gt;“Mini Shai-Hulud,”&lt;/a&gt;&lt;/u&gt; linking it to the &lt;u&gt;&lt;a href="https://www.cisa.gov/news-events/alerts/2025/09/23/widespread-supply-chain-compromise-impacting-npm-ecosystem" rel="noopener noreferrer" target="_blank"&gt;larger Shai-Hulud npm worm campaign&lt;/a&gt;&lt;/u&gt; that hit the JavaScript ecosystem last fall.&lt;/p&gt;&lt;div class="quill-block-image" id="quill-block-image-1778780851701-jzh0u73nr" data-source="{&amp;quot;dom_id&amp;quot;:&amp;quot;quill-block-image-1778780851701-jzh0u73nr&amp;quot;,&amp;quot;link&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4228/optimized_cec071c5-83c0-466d-aba0-34f8a7885014.png&amp;quot;,&amp;quot;image&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4228/optimized_cec071c5-83c0-466d-aba0-34f8a7885014.png&amp;quot;,&amp;quot;caption&amp;quot;:&amp;quot;The breach tactic spread to packages connected to Mistral and UiPath. (Photo courtesy of Waqqas Mir.)&amp;quot;,&amp;quot;error&amp;quot;:null}"&gt;&lt;div&gt;&lt;a href="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4228/optimized_cec071c5-83c0-466d-aba0-34f8a7885014.png" target="_blank" rel="noopener noreferrer"&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4228/optimized_cec071c5-83c0-466d-aba0-34f8a7885014.png" alt="The breach tactic spread to packages connected to Mistral and UiPath. (Photo courtesy of Waqqas Mir.)"&gt;&lt;/a&gt;&lt;figcaption class="quill-image-caption"&gt;The breach tactic spread to packages connected to Mistral and UiPath. 
(Photo courtesy of Waqqas Mir.)&lt;/figcaption&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;Instead of stealing a password, attackers opened a pull request that tricked TanStack’s own build system into running their code. When TanStack published a new version of the software, it contained malware designed to find credentials like cloud keys, GitHub tokens, and npm access. Researchers also spotted a &lt;u&gt;&lt;a href="https://x.com/dabit3/status/2053956743621648789" rel="noopener noreferrer" target="_blank"&gt;dead-man’s switch&lt;/a&gt;&lt;/u&gt;: If the stolen tokens were revoked before the malware was cleaned up, it could wipe the developer’s home directory on the way out. Shortly after the TanStack incident, npm packages belonging to enterprise automation company UiPath and French model-maker Mistral AI, among others, were breached using the same tactic.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;What it means: &lt;/strong&gt;The automated system that builds and ships code, rather than the code itself, is a new vulnerable spot in software supply chains. Teams that release software automatically should keep a ready-to-run audit (a Codex skill, Claude Code command, or other automated task) that, the moment a new breach is exposed, scans every repository for the compromised packages and flags what’s affected, what’s likely safe, and what needs human review. &lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;h3&gt;Data point&lt;/h3&gt;&lt;h4&gt;30 percent&lt;/h4&gt;&lt;p&gt;The drop in complaints about signs of AI writing from &lt;strong&gt;&lt;u&gt;&lt;a href="https://writewithspiral.com/?utm_source=everywebsite" rel="noopener noreferrer" target="_blank"&gt;Spiral&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; users, following the addition of a “top edit” step in its draft writing process.&lt;/p&gt;&lt;p&gt;Starting in mid-April, every time Spiral drafts content for a user, the text is sent to a fast model—Gemini 2.5 Flash—for a top edit. 
The model has one job: Strip the draft of all AI tells, including em dashes, &lt;u&gt;&lt;a href="https://every.to/context-window/model-wars" rel="noopener noreferrer" target="_blank"&gt;“It’s not X. It’s Y”&lt;/a&gt;&lt;/u&gt; reframes, and LLM vocabulary favorites such as “shift,” “shape,” and “delve.” Marcus regularly updates the “AI writing tells” list to reflect anonymized user sentiment. “It’s almost like a crowdsourced editor function,” he says. &lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;h3&gt;Inside Every&lt;/h3&gt;&lt;h4&gt;What is an agent, anyway?&lt;/h4&gt;&lt;p&gt;An &lt;u&gt;&lt;a href="https://every.to/guides/claw-school" rel="noopener noreferrer" target="_blank"&gt;OpenClaw running 24/7&lt;/a&gt;&lt;/u&gt; on a dedicated Mac Mini is an agent. So is a Codex session, or a custom GPT, or a &lt;u&gt;&lt;a href="https://every.to/source-code/the-folder-is-the-agent" rel="noopener noreferrer" target="_blank"&gt;folder&lt;/a&gt;&lt;/u&gt;. “It can be managed, it can be in the cloud, it can be on your computer,” Kieran says. “There are a trillion ways it can be an agent.”&lt;/p&gt;&lt;p&gt;The confusion emerges because the term agent—or any AI system that can take action or execute tasks autonomously—encompasses &lt;em&gt;a lot&lt;/em&gt;. &lt;/p&gt;&lt;p&gt;When nearly everything is an agent, the better question becomes what you want your agent to do. Dan breaks this into &lt;u&gt;&lt;a href="https://every.to/context-window/the-dawn-of-codex-native-apps" rel="noopener noreferrer" target="_blank"&gt;two categories&lt;/a&gt;&lt;/u&gt;: the agent you collaborate with, and the agent you delegate to. 
The former sharpens and extends your capabilities; the latter’s job is to execute without messing up or getting in the way.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Agent spotlight:&lt;/strong&gt; Inside Anthropic’s &lt;u&gt;&lt;a href="https://every.to/chain-of-thought/inside-anthropic-s-2026-developer-conference" rel="noopener noreferrer" target="_blank"&gt;Managed Agents&lt;/a&gt;&lt;/u&gt; console, &lt;u&gt;&lt;a href="https://writewithspiral.com/?utm_source=everywebsite" rel="noopener noreferrer" target="_blank"&gt;Spiral&lt;/a&gt;&lt;/u&gt;’s agents get their own versioned configuration, memory stores, custom tools, and credentials, and run in Anthropic’s cloud environment. It’s the versioned configuration, including the system prompt, that mainly determines how the agent works.&lt;/p&gt;&lt;p&gt;A small set of animating instructions—that’s an agent too.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;em&gt;&lt;a href="https://every.to/@laura_27bbaf_1" rel="noopener noreferrer" target="_blank"&gt;Laura Entis&lt;/a&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt; is a staff writer at Every. You can follow her on &lt;a href="https://www.linkedin.com/in/lauraentis/" rel="noopener noreferrer" target="_blank"&gt;LinkedIn&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;To read more essays like this, subscribe to &lt;u&gt;&lt;a href="https://every.to/subscribe" rel="noopener noreferrer" target="_blank"&gt;Every&lt;/a&gt;&lt;/u&gt;, and follow us on X at &lt;u&gt;&lt;a href="http://twitter.com/every" rel="noopener noreferrer" target="_blank"&gt;@every&lt;/a&gt;&lt;/u&gt; and on &lt;u&gt;&lt;a href="https://www.linkedin.com/company/everyinc/" rel="noopener noreferrer" target="_blank"&gt;LinkedIn&lt;/a&gt;&lt;/u&gt;.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;We &lt;u&gt;&lt;a href="https://every.to/studio" rel="noopener noreferrer" target="_blank"&gt;build AI tools&lt;/a&gt;&lt;/u&gt; for readers like you. 
Write brilliantly with &lt;u&gt;&lt;a href="https://writewithspiral.com/" rel="noopener noreferrer" target="_blank"&gt;Spiral&lt;/a&gt;&lt;/u&gt;. Organize files automatically with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://makeitsparkle.co/?utm_source=everyfooter" rel="noopener noreferrer" target="_blank"&gt;Sparkle&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. Deliver yourself from email with &lt;u&gt;&lt;a href="https://cora.computer/" rel="noopener noreferrer" target="_blank"&gt;Cora&lt;/a&gt;&lt;/u&gt;. Dictate effortlessly with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://monologue.to/" rel="noopener noreferrer" target="_blank"&gt;Monologue&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. Collaborate with agents on documents with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;a href="https://www.proofeditor.ai/" rel="noopener noreferrer" target="_blank"&gt;Proof&lt;/a&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. &lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;Help us scale the only subscription you need to stay at the edge of AI. Explore &lt;u&gt;&lt;a href="https://www.notion.so/Jobs-Every-25cca4f355ac80c5ad6ee7a6e93d6b4e?pvs=21" rel="noopener noreferrer" target="_blank"&gt;open roles at Every&lt;/a&gt;&lt;/u&gt;.&lt;/em&gt;&lt;/p&gt;&lt;div class="quill-button" data-source="{&amp;quot;id&amp;quot;:&amp;quot;quill-button-1769187301610&amp;quot;,&amp;quot;url&amp;quot;:&amp;quot;https://every.to/subscribe?source=post_button&amp;quot;,&amp;quot;text&amp;quot;:&amp;quot;Subscribe&amp;quot;}" id="quill-button-1769187301610"&gt;&lt;a href="https://every.to/subscribe?source=post_button"&gt;Subscribe&lt;/a&gt;&lt;/div&gt;</description>
      <author>Laura Entis / Context Window</author>
      <pubDate>2026-05-14 09:00:00 -0400</pubDate>
      <guid>https://every.to/context-window/opus-4-7-reels-us-back-in</guid>
      <link>https://every.to/context-window/opus-4-7-reels-us-back-in</link>
    </item>
    <item>
      <title>Mining Your Life for Context</title>
      <description>&lt;table&gt;&lt;tr&gt;&lt;td&gt;&lt;img alt="Context Window" src="https://d24ovhgu8s7341.cloudfront.net/uploads/publication/logo/94/small_context_windown_1.png" /&gt;&lt;/td&gt;&lt;td&gt;&lt;/td&gt;&lt;td&gt;&lt;table&gt;&lt;tr&gt;&lt;td&gt;by &lt;a href="https://every.to/@laura_27bbaf_1" itemprop="name"&gt;Laura Entis&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;in &lt;a href="https://every.to/context-window"&gt;Context Window&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;figure&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/post/cover/4194/full_page_cover_bb5f6b5b1eeab908-How_to_Mine_the_Context_of_Your_Life.png"&gt;&lt;figcaption&gt;Midjourney/Every illustration.&lt;/figcaption&gt;&lt;/figure&gt;&lt;p&gt;LLMs make a lot of life &lt;u&gt;&lt;a href="https://every.to/on-every/introducing-monologue-notes-record-every-meeting-call-and-voice-memo" rel="noopener noreferrer" target="_blank"&gt;searchable&lt;/a&gt;&lt;/u&gt;, from meeting transcripts to iMessages to half-formed morning thoughts, but all this context only helps if you know what you want to achieve. Today, we’re revisiting how AI entrepreneur &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@noah_1729" rel="noopener noreferrer" target="_blank"&gt;Noah Brier&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; uses Claude Code as a second brain to sharpen and expand his own ideas, Every head of growth &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@tedescau" rel="noopener noreferrer" target="_blank"&gt;Austin Tedesco&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; shares how Codex helped him spot the interruptions crowding out deeper work, and we offer a workflow for mining your scattered past insights into a coherent draft.&lt;/p&gt;&lt;p&gt;&lt;em&gt;Was this newsletter forwarded to you? 
&lt;u&gt;&lt;a href="https://every.to/account" rel="noopener noreferrer" target="_blank"&gt;Sign up&lt;/a&gt;&lt;/u&gt; to get it in your inbox.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;&lt;strong&gt;Spotlight&lt;/strong&gt;&lt;/h2&gt;&lt;h4&gt;&lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@noah_1729" rel="noopener noreferrer" target="_blank"&gt;Noah Brier&lt;/a&gt;&lt;/u&gt;, AI entrepreneur and seer&lt;/strong&gt;&lt;/h4&gt;&lt;p&gt;Brier is a true AI early adopter. The cofounder of the AI consultancy &lt;u&gt;&lt;a href="https://www.alephic.com/" rel="noopener noreferrer" target="_blank"&gt;Alephic&lt;/a&gt;&lt;/u&gt;, Brier was all in on using Claude Code as a &lt;u&gt;&lt;a href="https://every.to/podcast/how-to-use-claude-code-as-a-thinking-partner" rel="noopener noreferrer" target="_blank"&gt;“second brain”&lt;/a&gt;&lt;/u&gt; for knowledge work back when most people still viewed the tool as a place to write code.&lt;/p&gt;&lt;p&gt;In September, Brier told &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@danshipper" rel="noopener noreferrer" target="_blank"&gt;Dan Shipper&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; on our podcast, &lt;em&gt;&lt;u&gt;&lt;a href="https://every.to/podcast" rel="noopener noreferrer" target="_blank"&gt;AI &amp;amp; I&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;, how he turned the coding app into a research, thinking, and writing partner by connecting it to thousands of his personal notes. Since then, he’s started thinking beyond his own productivity—how does AI make it easier or harder for an entire organization to stay working toward the same goal? 
For that, he has a new framework, &lt;u&gt;&lt;a href="https://every.to/thesis/the-culture-of-ai-engineering" rel="noopener noreferrer" target="_blank"&gt;announced in Every last week&lt;/a&gt;&lt;/u&gt;, that he calls the “pace layers” of AI engineering, drawn from &lt;strong&gt;Stewart Brand&lt;/strong&gt;’s system for describing how different parts of society change at different speeds. &lt;/p&gt;&lt;p&gt;Just as hooking up Claude Code to an ocean of personal information requires you to determine what is—and isn’t—worth surfacing, running a successful AI company relies on human judgment. Similarly, AI makes code free to produce, but it doesn’t make it easier to identify a product people actually want or orient an entire system of humans and agents around that vision.&lt;/p&gt;&lt;p&gt;Read Brier’s &lt;u&gt;&lt;a href="https://every.to/thesis/the-culture-of-ai-engineering" rel="noopener noreferrer" target="_blank"&gt;essay&lt;/a&gt;&lt;/u&gt; on the framework he uses to achieve alignment and then watch his &lt;em&gt;AI &amp;amp; I&lt;/em&gt; episode on &lt;a href="https://youtu.be/in7i-EVDDlk" rel="noopener noreferrer" target="_blank"&gt;YouTube&lt;/a&gt;, or listen on &lt;u&gt;&lt;a href="https://open.spotify.com/episode/3P6tNiFNbcp5B3nnFXpRId?si=m0BsGMkSSQajiObpYdCwCg" rel="noopener noreferrer" target="_blank"&gt;Spotify&lt;/a&gt;&lt;/u&gt; or &lt;a href="https://podcasts.apple.com/us/podcast/claude-code-can-be-your-second-brain/id1719789201?i=1000767592752" rel="noopener noreferrer" target="_blank"&gt;Apple Podcasts&lt;/a&gt;. 
Here’s a link to the &lt;u&gt;&lt;a href="https://every.to/podcast/transcript-how-to-use-claude-code-as-a-thinking-partner" rel="noopener noreferrer" target="_blank"&gt;episode transcript&lt;/a&gt;&lt;/u&gt;.&lt;/p&gt;&lt;div class="quill-block-image" id="quill-block-image-1778682936719" data-source="{&amp;quot;dom_id&amp;quot;:&amp;quot;quill-block-image-1778682936719&amp;quot;,&amp;quot;link&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4194/optimized_d071f954-ea08-4fef-b37a-2879522f7acb.jpg&amp;quot;,&amp;quot;image&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4194/optimized_d071f954-ea08-4fef-b37a-2879522f7acb.jpg&amp;quot;,&amp;quot;caption&amp;quot;:&amp;quot;Serial entrepreneur Noah Brier uses Claude Code as a second brain for knowledge work. (Photo courtesy of Sarah Jay Halliday for Every.)&amp;quot;,&amp;quot;error&amp;quot;:null}"&gt;&lt;div&gt;&lt;a href="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4194/optimized_d071f954-ea08-4fef-b37a-2879522f7acb.jpg" target="_blank" rel="noopener noreferrer"&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4194/optimized_d071f954-ea08-4fef-b37a-2879522f7acb.jpg" alt="Serial entrepreneur Noah Brier uses Claude Code as a second brain for knowledge work. (Photo courtesy of Sarah Jay Halliday for Every.)"&gt;&lt;/a&gt;&lt;figcaption class="quill-image-caption"&gt;Serial entrepreneur Noah Brier uses Claude Code as a second brain for knowledge work. 
(Photo courtesy of Sarah Jay Halliday for Every.)&lt;/figcaption&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;&lt;strong&gt;Data point&lt;/strong&gt;&lt;/h2&gt;&lt;h4&gt;&lt;strong&gt;671&lt;/strong&gt;&lt;/h4&gt;&lt;p&gt;That’s the number of times iMessage is active on Austin’s screen each day, according to &lt;u&gt;&lt;a href="https://developers.openai.com/codex/memories/chronicle" rel="noopener noreferrer" target="_blank"&gt;Chronicle&lt;/a&gt;&lt;/u&gt;, Codex’s screen-context memory feature that uses screenshots to analyze your computer activity. He’d like to get that number down to 150.&lt;/p&gt;&lt;p&gt;Reducing how much he opens and interacts with iMessage is just part of the productivity regimen Codex created when Austin had it use Chronicle to determine how he could use his computer more efficiently. Other directives include slashing interactions across Slack, email, and Chrome.&lt;/p&gt;&lt;p&gt;Austin is game—he’d like to do more &lt;u&gt;&lt;a href="https://every.to/p/the-agent-that-saved-my-brain" rel="noopener noreferrer" target="_blank"&gt;focused work&lt;/a&gt;&lt;/u&gt;, primarily by resisting the urge to bounce between apps and tabs and instead spending as much time as possible in the Codex app, where he can draft and review assets, emails, and Slack messages inside the in-app browser.&lt;/p&gt;&lt;p&gt;“I’m excited by the idea of keeping Codex open and staying focused. 
Then it can flag, ‘This is your one hour for comms stuff, go’—or even say, ‘Go to respond to this stuff, I’ve already drafted the responses for you,’” he says.&lt;/p&gt;&lt;p&gt;If you want your bad computer habits similarly analyzed, &lt;u&gt;&lt;a href="https://x.com/ajambrosino/status/2049839184110645691" rel="noopener noreferrer" target="_blank"&gt;paste the following&lt;/a&gt;&lt;/u&gt; into Codex: &lt;/p&gt;&lt;blockquote&gt;&lt;em&gt;What have I been doing very inefficiently on my computer (according to Chronicle). Make some recommendations. Be direct. Tell me what I need to hear.&lt;/em&gt;&lt;/blockquote&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;&lt;strong&gt;Steal this workflow&lt;/strong&gt;&lt;/h2&gt;&lt;h4&gt;&lt;strong&gt;Mine your own scattered thinking before you draft&lt;/strong&gt;&lt;/h4&gt;&lt;p&gt;By the time you sit down to write the article, strategy memo, or launch page, you’ve probably already expressed most of what you want to say across Slack threads, Notion documents, voice memos, and meeting transcripts. Here’s how to mine all that content for gold—and avoid the paralysis of the blank page.&lt;/p&gt;&lt;h4&gt;&lt;strong&gt;The workflow:&lt;/strong&gt;&lt;/h4&gt;&lt;ol&gt;&lt;li&gt;&lt;strong&gt;Capture by default, sort later.&lt;/strong&gt; &lt;strong&gt;&lt;u&gt;&lt;a href="https://www.monologue.to/?utm_source=everywebsite" rel="noopener noreferrer" target="_blank"&gt;Monologue&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; general manager &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@naveen_6804" rel="noopener noreferrer" target="_blank"&gt;Naveen Naidu&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; treats the app as a transit point: He hits record on meetings, user calls, conversations with coworkers, and his rambling early-morning thoughts, because he knows he can always come back and pull what he needs. 
The tool matters less than the habit—pick one (&lt;u&gt;&lt;a href="https://every.to/on-every/introducing-monologue-effortless-voice-dictation" rel="noopener noreferrer" target="_blank"&gt;Monologue Notes&lt;/a&gt;&lt;/u&gt;, a voice memo app, whatever) and use it everywhere you do your thinking, not just at your desk.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Connect every source your agent can read.&lt;/strong&gt; Give your coding agent access to Slack, Notion, Google Drive, Monologue Notes, and your meeting transcripts. For anything without a connector, export the files into a folder that the agent can search. The goal is one searchable repository across every place your ideas live.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Name the deliverable and constrain the source.&lt;/strong&gt; Tell the agent what you’re drafting—article, strategy memo, launch page, go-to-market plan—and specify in your prompt (or project instructions) that it should pull only from things &lt;em&gt;you’ve&lt;/em&gt; already said to avoid drafts that blend your thinking with AI-generated concepts.&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;&lt;strong&gt;Try it this week:&lt;/strong&gt; Connect your agent to the two or three places where most of your thinking lives—Slack and Notion are usually a good start, plus meeting transcripts if you have them. Then paste: &lt;/p&gt;&lt;blockquote&gt;&lt;em&gt;“Find everything I’ve said about [topic] across these sources. 
Group the strongest threads, cite the source for each, and turn them into a draft outline.”&lt;/em&gt;&lt;/blockquote&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;Discuss&lt;/strong&gt;&lt;/h3&gt;&lt;blockquote&gt;“I’ll use aggressively casual language, like, ‘hey yo, for real,’ or drop a bunch of exclamation points.”—&lt;strong&gt;Sarah Suzuki Harvard&lt;/strong&gt;, copywriter, in the &lt;em&gt;&lt;u&gt;&lt;a href="https://www.wsj.com/tech/ai/writers-are-going-to-extremes-to-prove-they-didnt-use-ai-46e7c3f7" rel="noopener noreferrer" target="_blank"&gt;Wall Street Journal&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/blockquote&gt;&lt;p&gt;LLMs have &lt;u&gt;&lt;a href="https://www.axios.com/2026/05/02/ai-changing-writing-speaking" rel="noopener noreferrer" target="_blank"&gt;flattened&lt;/a&gt;&lt;/u&gt; how most writing sounds. In response, professional writers are leaning into the colloquial and idiosyncratic, per the &lt;em&gt;Journal&lt;/em&gt;, peppering their prose with obscure references, run-on sentences, and intentional typos to prove it wasn’t machine-made. As AI-generated content &lt;u&gt;&lt;a href="https://www.404media.co/your-ai-use-is-breaking-my-brain/" rel="noopener noreferrer" target="_blank"&gt;consumes more of the internet&lt;/a&gt;&lt;/u&gt;, the split between polished predictability and curated weirdness will only widen.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;em&gt;&lt;a href="https://every.to/@laura_27bbaf_1" rel="noopener noreferrer" target="_blank"&gt;Laura Entis&lt;/a&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt; is a staff writer at Every. You can follow her on &lt;a href="https://www.linkedin.com/in/lauraentis/" rel="noopener noreferrer" target="_blank"&gt;LinkedIn&lt;/a&gt;. 
&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;To read more essays like this, subscribe to &lt;u&gt;&lt;a href="https://every.to/subscribe" rel="noopener noreferrer" target="_blank"&gt;Every&lt;/a&gt;&lt;/u&gt;, and follow us on X at &lt;u&gt;&lt;a href="http://twitter.com/every" rel="noopener noreferrer" target="_blank"&gt;@every&lt;/a&gt;&lt;/u&gt; and on &lt;u&gt;&lt;a href="https://www.linkedin.com/company/everyinc/" rel="noopener noreferrer" target="_blank"&gt;LinkedIn&lt;/a&gt;&lt;/u&gt;.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;For sponsorship opportunities, reach out to sponsorships@every.to.&lt;/em&gt;&lt;/p&gt;&lt;div class="quill-button" data-source="{&amp;quot;id&amp;quot;:&amp;quot;quill-button-1769187301610&amp;quot;,&amp;quot;url&amp;quot;:&amp;quot;https://every.to/subscribe?source=post_button&amp;quot;,&amp;quot;text&amp;quot;:&amp;quot;Subscribe&amp;quot;}" id="quill-button-1769187301610"&gt;&lt;a href="https://every.to/subscribe?source=post_button"&gt;Subscribe&lt;/a&gt;&lt;/div&gt;</description>
      <author>Laura Entis / Context Window</author>
      <pubDate>2026-05-13 07:00:00 -0400</pubDate>
      <guid>https://every.to/context-window/mining-your-life-for-context</guid>
      <link>https://every.to/context-window/mining-your-life-for-context</link>
    </item>
    <item>
      <title>The Fallacy of the 16-hour Agent</title>
      <description>&lt;table&gt;&lt;tr&gt;&lt;td&gt;&lt;img alt="Context Window" src="https://d24ovhgu8s7341.cloudfront.net/uploads/publication/logo/94/small_context_windown_1.png" /&gt;&lt;/td&gt;&lt;td&gt;&lt;/td&gt;&lt;td&gt;&lt;table&gt;&lt;tr&gt;&lt;td&gt;by &lt;a href="https://every.to/@katie.parrott12" itemprop="name"&gt;Katie Parrott&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;in &lt;a href="https://every.to/context-window"&gt;Context Window&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;figure&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/post/cover/4193/full_page_cover_4eb6d6f7d3d67eef-1.png"&gt;&lt;figcaption&gt;Midjourney/Every illustration.&lt;/figcaption&gt;&lt;/figure&gt;&lt;p&gt;New data on long-horizon AI reliability just dropped, and depending on which chart you saw, you either think autonomous AI has arrived or it’s still years away. Today, we break down which version of the research to trust, plus Perplexity shares its methodology for building agent skills that don’t rot in production, Every CEO &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@danshipper" rel="noopener noreferrer" target="_blank"&gt;Dan Shipper&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; turns his piano keyboard into a real-time Codex-powered music coach, and Gusto co-founder &lt;strong&gt;Edward Kim&lt;/strong&gt; warns that the office of the future is going to sound more like a sales floor.—&lt;em&gt;&lt;u&gt;&lt;a href="https://every.to/on-every/kate-lee-joins-every-as-editor-in-chief" rel="noopener noreferrer" target="_blank"&gt;Kate Lee&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;Was this newsletter forwarded to you? 
&lt;u&gt;&lt;a href="https://every.to/account" rel="noopener noreferrer" target="_blank"&gt;Sign up&lt;/a&gt;&lt;/u&gt; to get it in your inbox.&lt;/em&gt;&lt;/p&gt;&lt;div class="quill-button" data-source="{&amp;quot;id&amp;quot;:&amp;quot;quill-button-1769530239147&amp;quot;,&amp;quot;url&amp;quot;:&amp;quot;https://every.to/subscribe?source=post_button&amp;quot;,&amp;quot;text&amp;quot;:&amp;quot;Subscribe&amp;quot;}" id="quill-button-1769530239147"&gt;&lt;a href="https://every.to/subscribe?source=post_button"&gt;Subscribe&lt;/a&gt;&lt;/div&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;Signal&lt;/h2&gt;&lt;h4&gt;&lt;strong&gt;The 24/7 agent is nearly upon us—or is it?&lt;/strong&gt;&lt;/h4&gt;&lt;p&gt;The &lt;u&gt;&lt;a href="https://every.to/chain-of-thought/toward-a-definition-of-agi" rel="noopener noreferrer" target="_blank"&gt;holy grail&lt;/a&gt;&lt;/u&gt; of agentic AI has been long-horizon reliability—an agent to which you can hand a task and trust to still be on the right thread hours later, when context has decayed and there’s no human in the loop to catch a wrong turn. &lt;u&gt;&lt;a href="https://metr.org/" rel="noopener noreferrer" target="_blank"&gt;METR&lt;/a&gt;&lt;/u&gt;, a nonprofit that measures AI capabilities, released an update to its research showing how close we are to that autonomous future. 
&lt;/p&gt;&lt;p&gt;One chart from the update circulating online shows an early preview of Anthropic’s next model, &lt;u&gt;&lt;a href="https://every.to/context-window/every-is-half-agent-now#signal" rel="noopener noreferrer" target="_blank"&gt;Mythos&lt;/a&gt;&lt;/u&gt;, blowing past existing models and the 16-hour range that METR’s benchmark suite can reliably test—literally breaking the scale.&lt;/p&gt;&lt;div class="quill-block-image" id="quill-block-image-1778616282904-ut24i8yum" data-source="{&amp;quot;dom_id&amp;quot;:&amp;quot;quill-block-image-1778616282904-ut24i8yum&amp;quot;,&amp;quot;link&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4193/optimized_89f043c6-b30d-4d6d-b251-48a071db1ed0.png&amp;quot;,&amp;quot;image&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4193/optimized_89f043c6-b30d-4d6d-b251-48a071db1ed0.png&amp;quot;,&amp;quot;caption&amp;quot;:&amp;quot;Claude Mythos Preview reaches the edge of METR’s current measurement range at 50 percent success. METR cautions that results above 16 hours are unreliable with its current task suite. (Image courtesy of METR.)&amp;quot;,&amp;quot;error&amp;quot;:null}"&gt;&lt;div&gt;&lt;a href="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4193/optimized_89f043c6-b30d-4d6d-b251-48a071db1ed0.png" target="_blank" rel="noopener noreferrer"&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4193/optimized_89f043c6-b30d-4d6d-b251-48a071db1ed0.png" alt="Claude Mythos Preview reaches the edge of METR’s current measurement range at 50 percent success. METR cautions that results above 16 hours are unreliable with its current task suite. (Image courtesy of METR.)"&gt;&lt;/a&gt;&lt;figcaption class="quill-image-caption"&gt;Claude Mythos Preview reaches the edge of METR’s current measurement range at 50 percent success. METR cautions that results above 16 hours are unreliable with its current task suite. 
(Image courtesy of METR.)&lt;/figcaption&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;&lt;br&gt;&lt;/p&gt;&lt;p&gt;It’s important to note, however, that how many human hours a task takes is not the same as how long a model takes to run those same tasks. Duration, the way that METR’s benchmark uses it, stands in for &lt;em&gt;difficulty&lt;/em&gt;. As the nonprofit writes in the report’s FAQ: “AI agents are typically several times faster than humans on tasks they complete successfully.”&lt;/p&gt;&lt;p&gt;That last bit—tasks completed &lt;em&gt;successfully&lt;/em&gt;—adds another twist to the benchmark. The 16-plus hour measurement is based on a 50 percent success rate. A separate measurement of how LLMs perform at 80 percent reliability shows that Mythos can run tasks that would take humans a little over three hours. It’s a significant step up from the closest competitor measured, Gemini 3.1 Pro (METR doesn’t currently have measurements for &lt;u&gt;&lt;a href="https://every.to/vibe-check/opus-4-7" rel="noopener noreferrer" target="_blank"&gt;Opus 4.7&lt;/a&gt;&lt;/u&gt; or &lt;u&gt;&lt;a href="https://every.to/vibe-check/gpt-5-5" rel="noopener noreferrer" target="_blank"&gt;GPT-5.5&lt;/a&gt;&lt;/u&gt;). But it brings Mythos back down to earth. &lt;/p&gt;&lt;div class="quill-block-image" id="quill-block-image-1778616282911-fa33exxfd" data-source="{&amp;quot;dom_id&amp;quot;:&amp;quot;quill-block-image-1778616282911-fa33exxfd&amp;quot;,&amp;quot;link&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4193/optimized_265b34c9-1357-49eb-9eb5-2ad018d2e9c1.png&amp;quot;,&amp;quot;image&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4193/optimized_265b34c9-1357-49eb-9eb5-2ad018d2e9c1.png&amp;quot;,&amp;quot;caption&amp;quot;:&amp;quot;LLMs measured against METR’s time horizon test for completing tasks with 80 percent success, presented on a logarithmic scale. 
(Image courtesy of METR.)&amp;quot;,&amp;quot;error&amp;quot;:null}"&gt;&lt;div&gt;&lt;a href="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4193/optimized_265b34c9-1357-49eb-9eb5-2ad018d2e9c1.png" target="_blank" rel="noopener noreferrer"&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4193/optimized_265b34c9-1357-49eb-9eb5-2ad018d2e9c1.png" alt="LLMs measured against METR’s time horizon test for completing tasks with 80 percent success, presented on a logarithmic scale. (Image courtesy of METR.)"&gt;&lt;/a&gt;&lt;figcaption class="quill-image-caption"&gt;LLMs measured against METR’s time horizon test for completing tasks with 80 percent success, presented on a logarithmic scale. (Image courtesy of METR.)&lt;/figcaption&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;&lt;br&gt;&lt;/p&gt;&lt;p&gt;Both these things are true: Duration can be a useful proxy for difficulty, and benchmarks don’t reflect reality. “[They] don’t measure model capability alone,” &lt;u&gt;&lt;a href="https://x.com/danshipper/status/2053191885116571935" rel="noopener noreferrer" target="_blank"&gt;says&lt;/a&gt;&lt;/u&gt; Dan. “They measure model capability after a human has done the work of finding a prompt that lets the model’s capability appear.”&lt;/p&gt;&lt;h5&gt;&lt;strong&gt;What to do this week:&lt;/strong&gt;&lt;/h5&gt;&lt;p&gt;1. &lt;strong&gt;Figure out your longest agent run. &lt;/strong&gt;METR teaches us that duration might be a good approximation of difficulty. Ask: What’s the longest stretch you’ve trusted an agent on autopilot? If you don’t know, you can’t extend it.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;2. Extend your agent’s runtime by giving it a goal.&lt;/strong&gt; Last month, OpenAI shipped a new &lt;u&gt;&lt;a href="https://developers.openai.com/codex/use-cases/follow-goals" rel="noopener noreferrer" target="_blank"&gt;/goals&lt;/a&gt;&lt;/u&gt; command in Codex that allows agents to pursue objectives across multiple turns without checking in. 
Yesterday, Anthropic &lt;u&gt;&lt;a href="https://code.claude.com/docs/en/goal" rel="noopener noreferrer" target="_blank"&gt;introduced&lt;/a&gt;&lt;/u&gt; a similar command to the latest Claude Code version. Both are apt for long-running loops with clear criteria for success—and very much in line with what we’ve heard &lt;u&gt;&lt;a href="https://every.to/context-window/ai-work-is-splitting-in-two#ai-i-the-secrets-of-claudes-platform-from-the-team-that-built-it" rel="noopener noreferrer" target="_blank"&gt;from Claude’s platform team&lt;/a&gt;&lt;/u&gt;. Try it out today.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;3. Audit the effectiveness of your existing loops.&lt;/strong&gt; If you already have agents running overnight, “How long did your agent run?” is still a useful diagnostic—but ask it alongside, “With what guardrails, against what feedback signal, and at what verified accuracy?” &lt;/p&gt;&lt;h2&gt;&lt;hr class="quill-line"&gt;&lt;/h2&gt;&lt;h2&gt;Steal this workflow&lt;/h2&gt;&lt;h4&gt;Build your next agent skill like Perplexity does&lt;/h4&gt;&lt;p&gt;Creating a skill these days is relatively easy. Creating one that keeps working is not. We’ve seen skills that were running fine one day suddenly fire on the wrong request, fail to load when needed, or yield reports that weren’t as useful as they used to be. So the skill files get patched, growing longer every time the agent makes a mistake. But nobody can tell whether the latest edit helped or hurt.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Perplexity&lt;/strong&gt;, the AI search company building agentic research and browsing tools, recently &lt;u&gt;&lt;a href="https://research.perplexity.ai/articles/designing-refining-and-maintaining-agent-skills-at-perplexity" rel="noopener noreferrer" target="_blank"&gt;published its methodology&lt;/a&gt;&lt;/u&gt; for designing agent skills. The main lesson: Instead of starting with the skill, start with the tests. 
Highlights from the post: &lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;strong&gt;Write the evals first.&lt;/strong&gt; Pull five to 10 cases from production queries, known failures, and edge cases. Include negative examples—queries that should not invoke this skill.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Phrase triggers like a human would.&lt;/strong&gt; Start with, “Load when…” and use the language your users use. Perplexity’s example: Instead of “monitors pull requests,” try “babysit a PR,” “watch CI,” or “make sure this lands.” This way, the skill loads without your team having to use a specific command or technical phrase.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Write the body in principles, not procedures.&lt;/strong&gt; The model already knows commands; it needs direction on how to apply them. Instead of listing detailed steps to, say, check out a new code branch, then cherry-pick files to edit, then check for conflicts, and so on, Perplexity recommends instructions like, “Cherry-pick the commit onto a clean branch. Resolve conflicts preserving intent.”&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Codify failures into lessons.&lt;/strong&gt; When the agent fails in production, write the failure mode to the skill file. The mistake becomes a standing instruction that guards against future mistakes. &lt;/li&gt;&lt;li&gt;&lt;strong&gt;Edit instructions rigorously. &lt;/strong&gt;Ask with every line you add: “Would the agent get this wrong without this?” If not, cut it. Every extra line adds context cost.&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;&lt;strong&gt;Try it this week:&lt;/strong&gt; Pick one skill your team wants to improve. Write 10 test cases—five it should handle, five it should refuse or route elsewhere. Run the current skill against them. 
The gap is your backlog.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;&lt;strong&gt;Discuss&lt;/strong&gt;&lt;/h2&gt;&lt;blockquote&gt;“The office of the future will sound more like a sales floor.”—&lt;strong&gt;Edward Kim&lt;/strong&gt;, cofounder of Gusto, in the&lt;em&gt; Wall Street Journal&lt;/em&gt;&lt;/blockquote&gt;&lt;p&gt;A &lt;em&gt;Wall Street Journal&lt;/em&gt; article this week about &lt;u&gt;&lt;a href="https://www.wsj.com/tech/typing-is-being-replaced-by-whisperingand-its-way-more-annoying-a804fee7" rel="noopener noreferrer" target="_blank"&gt;AI dictation tools entering the workplace&lt;/a&gt;&lt;/u&gt; treats verbal prompting and composition as a manners problem—an angle that shows that the more things change, the more they stay the same. &lt;/p&gt;&lt;p&gt;Every new work interface eventually creates etiquette. Email created reply-all politics. Slack created notification politics. Voice AI is about to create room-tone politics: when you can talk to your computer, how loudly, and around whom. Great news for nosy office neighbors, but for the rest of us, it’s one more reason to curse the invention of open floor plans. &lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;Inside Every&lt;/h2&gt;&lt;p&gt;This week, &lt;u&gt;&lt;a href="https://thinkingmachines.ai/blog/interaction-models/" rel="noopener noreferrer" target="_blank"&gt;Thinking Machines Lab&lt;/a&gt;&lt;/u&gt; and &lt;u&gt;&lt;a href="https://x.com/openaidevs/status/2053964133570412826" rel="noopener noreferrer" target="_blank"&gt;OpenAI&lt;/a&gt;&lt;/u&gt; both announced bets on the same future: AI that watches and responds in real time, instead of waiting for its turn. OpenAI shipped its Realtime-2 voice models; Thinking Machines previewed an interaction model that watches video and audio simultaneously. 
&lt;/p&gt;&lt;p&gt;While we’re all waiting to see how the labs’ visions roll out, Dan used Codex to jerry-rig his own version.&lt;/p&gt;&lt;p&gt;On Saturday, he plugged his MIDI keyboard—a keyboard that translates notes into data a computer can read—into his laptop, opened &lt;u&gt;&lt;a href="https://x.com/danshipper/status/2053551046299959760" rel="noopener noreferrer" target="_blank"&gt;Codex&lt;/a&gt;&lt;/u&gt;, and asked it to build a piano app that would identify the chord he played—then keep watching and coach him as he practiced. The pattern generalizes to any live medium: writing in a document, drawing on a tablet, deadlifting in front of a phone. This is also the promise of hardware like Meta’s AR/VR glasses or &lt;u&gt;&lt;a href="https://every.to/napkin-math/don-t-dismiss-the-apple-vision-pro" rel="noopener noreferrer" target="_blank"&gt;Apple’s&lt;/a&gt;&lt;/u&gt; &lt;u&gt;&lt;a href="https://every.to/napkin-math/don-t-dismiss-the-apple-vision-pro" rel="noopener noreferrer" target="_blank"&gt;Vision&lt;/a&gt;&lt;/u&gt; &lt;u&gt;&lt;a href="https://every.to/napkin-math/ai-and-the-vision-pro-don-t-need-a-killer-app" rel="noopener noreferrer" target="_blank"&gt;Pro&lt;/a&gt;&lt;/u&gt;: AI that sees what you’re doing and responds in a way that’s useful. &lt;/p&gt;&lt;p&gt;Here’s how you can do it too:&lt;/p&gt;&lt;ol&gt;&lt;li&gt;&lt;strong&gt;Find the input pipe. &lt;/strong&gt;MIDI for instruments. Screen capture for writing or design. Camera plus a vision model for drawing or movement. Microphone for languages. &lt;/li&gt;&lt;li&gt;&lt;strong&gt;Have the agent build the watcher. &lt;/strong&gt;Ask Codex (or Claude Code) to write the app based on how you like to be coached. (For example, tell it to only provide one piece of feedback at a time, or to focus on one aspect of your technique and ignore another.) &lt;/li&gt;&lt;li&gt;&lt;strong&gt;Tune the feedback as you go.&lt;/strong&gt; First responses will be generic (“good chord progression”). 
Tell the watcher what’s useful and what’s not—“flag wrong notes only,” “ignore dynamics,” “let me finish a phrase before cutting in.” &lt;/li&gt;&lt;/ol&gt;&lt;div class="quill-block-image" id="quill-block-image-1778616282919-p7tssat3r" data-source="{&amp;quot;dom_id&amp;quot;:&amp;quot;quill-block-image-1778616282919-p7tssat3r&amp;quot;,&amp;quot;link&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4193/optimized_b28be645-54f0-4c0c-81a5-a4ab7d26ad7f.png&amp;quot;,&amp;quot;image&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4193/optimized_b28be645-54f0-4c0c-81a5-a4ab7d26ad7f.png&amp;quot;,&amp;quot;caption&amp;quot;:&amp;quot;Dan’s Codex-native piano coach setup, with the coaching app pulled up in the in-app browser. (Image courtesy of Dan Shipper.)&amp;quot;,&amp;quot;error&amp;quot;:null}"&gt;&lt;div&gt;&lt;a href="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4193/optimized_b28be645-54f0-4c0c-81a5-a4ab7d26ad7f.png" target="_blank" rel="noopener noreferrer"&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4193/optimized_b28be645-54f0-4c0c-81a5-a4ab7d26ad7f.png" alt="Dan’s Codex-native piano coach setup, with the coaching app pulled up in the in-app browser. (Image courtesy of Dan Shipper.)"&gt;&lt;/a&gt;&lt;figcaption class="quill-image-caption"&gt;Dan’s Codex-native piano coach setup, with the coaching app pulled up in the in-app browser. (Image courtesy of Dan Shipper.)&lt;/figcaption&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;&lt;br&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Try it this week:&lt;/strong&gt; Pick a skill you want to get better at. Open the medium where you practice. Spend an evening with your coding agent building the smallest watcher you can—input in, feedback out. 
Next thing you know, you’ll have a tutor you can summon on demand.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;em&gt;&lt;a href="https://every.to/@katie.parrott12" rel="noopener noreferrer" target="_blank"&gt;Katie Parrott&lt;/a&gt;&lt;/em&gt;&lt;/strong&gt; &lt;em&gt;is a staff writer at Every. You can read more of her work in&lt;/em&gt; &lt;em&gt;&lt;a href="https://katieparrott.substack.com/" rel="noopener noreferrer" target="_blank"&gt;her newsletter&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;To read more essays like this, subscribe to &lt;u&gt;&lt;a href="https://every.to/subscribe" rel="noopener noreferrer" target="_blank"&gt;Every&lt;/a&gt;&lt;/u&gt;, and follow us on X at &lt;u&gt;&lt;a href="http://twitter.com/every" rel="noopener noreferrer" target="_blank"&gt;@every&lt;/a&gt;&lt;/u&gt; and on &lt;u&gt;&lt;a href="https://www.linkedin.com/company/everyinc/" rel="noopener noreferrer" target="_blank"&gt;LinkedIn&lt;/a&gt;&lt;/u&gt;.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;We &lt;u&gt;&lt;a href="https://every.to/studio" rel="noopener noreferrer" target="_blank"&gt;build AI tools&lt;/a&gt;&lt;/u&gt; for readers like you. Write brilliantly with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://writewithspiral.com/" rel="noopener noreferrer" target="_blank"&gt;Spiral&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. Organize files automatically with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://makeitsparkle.co/?utm_source=everyfooter" rel="noopener noreferrer" target="_blank"&gt;Sparkle&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. Deliver yourself from email with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://cora.computer/" rel="noopener noreferrer" target="_blank"&gt;Cora&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. 
Dictate effortlessly with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://monologue.to/" rel="noopener noreferrer" target="_blank"&gt;Monologue&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. Collaborate with agents on documents with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;a href="https://www.proofeditor.ai/" rel="noopener noreferrer" target="_blank"&gt;Proof&lt;/a&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. &lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;Help us scale the only subscription you need to stay at the edge of AI. Explore &lt;u&gt;&lt;a href="https://www.notion.so/Jobs-Every-25cca4f355ac80c5ad6ee7a6e93d6b4e?pvs=21" rel="noopener noreferrer" target="_blank"&gt;open roles at Every&lt;/a&gt;&lt;/u&gt;.&lt;/em&gt;&lt;/p&gt;&lt;div class="quill-button" data-source="{&amp;quot;id&amp;quot;:&amp;quot;quill-button-1769187301610&amp;quot;,&amp;quot;url&amp;quot;:&amp;quot;https://every.to/subscribe?source=post_button&amp;quot;,&amp;quot;text&amp;quot;:&amp;quot;Subscribe&amp;quot;}" id="quill-button-1769187301610"&gt;&lt;a href="https://every.to/subscribe?source=post_button"&gt;Subscribe&lt;/a&gt;&lt;/div&gt;</description>
      <author>Katie Parrott / Context Window</author>
      <pubDate>2026-05-12 16:00:00 -0400</pubDate>
      <guid>https://every.to/context-window/the-fallacy-of-the-16-hour-agent</guid>
      <link>https://every.to/context-window/the-fallacy-of-the-16-hour-agent</link>
    </item>
    <item>
      <title>Socrates as a Service</title>
      <description>&lt;table&gt;&lt;tr&gt;&lt;td&gt;&lt;/td&gt;&lt;td&gt;&lt;table&gt;&lt;tr&gt;&lt;td&gt;by &lt;a href="https://every.to/@eleanor_b03474_1" itemprop="name"&gt;Eleanor Warnock&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;figure&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/post/cover/4192/full_page_cover_a10cbabf60c56389-Socrates-as-a-Service.png"&gt;&lt;figcaption&gt;Midjourney/Every illustration.&lt;/figcaption&gt;&lt;/figure&gt;&lt;p&gt;&lt;em&gt;Was this newsletter forwarded to you? &lt;u&gt;&lt;a href="https://every.to/account" rel="noopener noreferrer" target="_blank"&gt;Sign up&lt;/a&gt;&lt;/u&gt; to get it in your inbox.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;I’m a journalist and a communications expert. My job, in both roles, is to find ideas that people haven’t yet put into words—the anecdote that could become a front-page story, the framing that could crystallize a founder’s philosophy into something a customer remembers. &lt;/p&gt;&lt;p&gt;In an hour-long interview with someone, it might not be until minute 45 that we start getting into the good stuff. In two hours, there may only be one thing that stands out to me—a side story, a detail, some color. A little piece of gold dust. An investor I’ve worked closely with calls these “extraction sessions.” I call the people who do them well Socrates-as-a-service.&lt;/p&gt;&lt;p&gt;Those details and stories aren’t on the internet. They’re not in any model. And the model hasn’t yet replicated how I pull them out of people. The gap between what AI can do and what a great human questioner can surface is still wide—and it’s the gap where the best stories live. If you don’t have some way to surface that information in your organization, your brand and messaging are going to sound like all the other twice-boiled content out there. 
&lt;/p&gt;&lt;h2&gt;&lt;strong&gt;Osakan bread and the wisdom within &lt;/strong&gt;&lt;/h2&gt;&lt;p&gt;The stuff that I’m looking for has a name in management theory: “tacit knowledge.” The term comes from scientist and philosopher &lt;strong&gt;Michael Polanyi&lt;/strong&gt;, who defined it with the phrase, “We can know more than we can tell.” It’s the expertise and intuition that lives in our bodies and resists being turned into a document. &lt;/p&gt;&lt;p&gt;In a frequently cited &lt;a href="https://lumsa.it/sites/default/files/UTENTI/u95/LM51_ITA_The%20Knowledge-Creating%20Company.pdf" rel="noopener noreferrer" target="_blank"&gt;1991 article&lt;/a&gt;, Japanese management expert &lt;strong&gt;Ikujiro Nonaka &lt;/strong&gt;argued that while Western companies excelled at “information processing,” Japanese companies specialized in the “creation of knowledge,” through a feedback loop that turned tacit knowledge into a competitive advantage. His most memorable example: In the 1980s, the Osaka-based Matsushita Electric Company was struggling to get the kneading right in a bread machine. They sent a software developer to apprentice with a baker at a local hotel famous for its luscious loaves. The knowledge she brought back helped the team perfect the dough-stretching technology inside the machine and ultimately create a top-selling device. &lt;/p&gt;&lt;p&gt;I am sure that the lucky engineer asked the baker a lot of questions, but there was certainly a lot she absorbed just from watching. Indeed, Polanyi argued that tacit knowledge exists outside of numbers or symbolic language—the kind of systemization that AI requires to ingest information. &lt;/p&gt;&lt;p&gt;Many “bakers” from whom we try to extract tacit knowledge often don’t even know the depth of expertise they carry. And they certainly couldn’t tell you what questions you need to ask to access it. 
&lt;/p&gt;&lt;h2&gt;&lt;strong&gt;AI as an imperfect interlocutor &lt;/strong&gt;&lt;/h2&gt;&lt;p&gt;AI can do some of that questioning and, in some cases, do it well. At Every, we have an AI agent ask us questions &lt;u&gt;&lt;a href="https://every.to/source-code/how-we-run-a-25-person-company-on-four-ai-agents" rel="noopener noreferrer" target="_blank"&gt;when we write OKRs&lt;/a&gt;&lt;/u&gt;. The agent has ingested Every’s company strategy and has context on all the members of the organization. My colleague, &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@katie.parrott12" rel="noopener noreferrer" target="_blank"&gt;Katie Parrott&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;, has Claude interview her &lt;u&gt;&lt;a href="https://every.to/working-overtime/writing-with-ai-is-harder-than-you-think" rel="noopener noreferrer" target="_blank"&gt;before she writes an article&lt;/a&gt;&lt;/u&gt;. Those notes become the basis of an outline of the piece.&lt;/p&gt;&lt;p&gt;I would argue, however, that AI-driven extraction works well when the parameters are clear and the assignment structured, like writing an article or a plan for software. It works less well if you’re looking to turn over a completely new rock, interview someone about something they haven’t spoken much about before, or run the kind of open-ended information-gathering work that happens when companies decide to rebrand. In those sessions, a chief marketing officer or branding agency will spend time speaking to members of the company and asking them open-ended questions about the business. The point is to keep things open, go wide, and see what comes up. &lt;/p&gt;&lt;p&gt;There’s a second problem: A human in the room can be surprised mid-conversation and abandon the plan—perhaps notice hesitation or dig into a thread that wasn’t on the list. A prompt mostly can’t. When I elicit insight from someone, I am applying my judgment about what is a good story in real time—judgment that’s been honed by years in news and communications. 
This mutual, live attention is something AI can’t capture because it’s not in the room. &lt;/p&gt;&lt;p&gt;The obvious objection is that this is a moving target—context windows and memory are improving to allow for more detailed, fluid conversations. &lt;u&gt;&lt;a href="https://every.to/p/what-is-taste-really" rel="noopener noreferrer" target="_blank"&gt;Taste&lt;/a&gt;&lt;/u&gt;, however, won’t. Someone still has to decide which detail out of a two-hour conversation is the piece of gold dust.&lt;/p&gt;&lt;p&gt;Nonaka himself argued that the goal isn’t always to make tacit knowledge fully explicit. Because tacit knowledge is so personal and often so abstract, sometimes the right tool with which to communicate is a metaphor or an analogy—a form of language that can hold multiple ambiguous meanings. Eliciting that kind of language from someone takes its own form of tacit knowledge: the skills of a Socrates.&lt;/p&gt;&lt;h2&gt;&lt;strong&gt;Steal these techniques &lt;/strong&gt;&lt;/h2&gt;&lt;p&gt;So how can you surface those nuggets of gold? Despite the explosion of interview podcasts asking for multiple hours of your time, I find most hosts are not great at asking questions. The format demands an arc—a journey—which is the opposite of what you want when you’re trying to surface tacit knowledge. Real extraction zigs and zags, doubling back on itself and picking up something you said 20 minutes ago to pull a different thread, following gold, not audience interest. &lt;/p&gt;&lt;p&gt;Here are the techniques that I keep coming back to: &lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;strong&gt;Warm people up. &lt;/strong&gt;We open up more once trust is established. 
I never skip the small talk at the beginning of a conversation, and I’ll often bring something we have in common: “I saw you just spoke about X—I’ve been thinking about that too.” NPR interviewer &lt;strong&gt;Terry Gross&lt;/strong&gt;’s favorite icebreaker question is, &lt;a href="https://www.nytimes.com/2018/11/17/style/self-care/terry-gross-conversation-advice.html" rel="noopener noreferrer" target="_blank"&gt;“&lt;/a&gt;&lt;u&gt;&lt;a href="https://www.nytimes.com/2018/11/17/style/self-care/terry-gross-conversation-advice.html" rel="noopener noreferrer" target="_blank"&gt;Tell me about yourself&lt;/a&gt;&lt;/u&gt;&lt;a href="https://www.nytimes.com/2018/11/17/style/self-care/terry-gross-conversation-advice.html" rel="noopener noreferrer" target="_blank"&gt;.”&lt;/a&gt; The question lets the person you are speaking to take the lead and protects you as the questioner from saying anything that might make them prickle while you are still warming up. &lt;/li&gt;&lt;li&gt;&lt;strong&gt;Ask a mix of general and specific questions. &lt;/strong&gt;When &lt;strong&gt;Lenny Rachitsky&lt;/strong&gt; &lt;u&gt;&lt;a href="https://review.firstround.com/reluctantly-influential-inside-lenny-rachitskys-demandingly-chill-life/" rel="noopener noreferrer" target="_blank"&gt;revealed the questions&lt;/a&gt;&lt;/u&gt; he sends to his podcast guests in preparation for the podcast, this combination stood out. For example, he asks them, “Anything you haven’t shared elsewhere that could be interesting to share in this forum?”—a very general question, and “What’s one pivotal moment in your career?”—which asks the guest to pinpoint one turning point. To extract unverbalized insights from someone, it helps to ask them to both think macro about their area of expertise as well as micro. &lt;/li&gt;&lt;li&gt;&lt;strong&gt;Come back to thoughts and drill in. &lt;/strong&gt;If a line of inquiry goes nowhere, don’t abandon it—go back later and try again from a different angle. 
The first pass often loosens wisdom up.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Repeat things back. &lt;/strong&gt;Repeating what someone said often helps them process their thoughts further, and they will often add additional detail they didn’t know they remembered. &lt;/li&gt;&lt;li&gt;&lt;strong&gt;Detail, detail, detail. &lt;/strong&gt;Specifics are where the real stuff lives. How did that make you feel in that moment? Why do you think that way? &lt;/li&gt;&lt;li&gt;&lt;strong&gt;Listen well. &lt;/strong&gt;Pulitzer Prize-winning radio journalist &lt;strong&gt;Studs Terkel&lt;/strong&gt; spent decades interviewing everyday people in Chicago, and was &lt;u&gt;&lt;a href="https://transom.org/2001/studs-terkel-in-conversation/" rel="noopener noreferrer" target="_blank"&gt;described by one subject&lt;/a&gt;&lt;/u&gt; as offering “a state of being, it’s a way of attending to, attention-ing another person.” That is what good listening looks like. &lt;/li&gt;&lt;li&gt;&lt;strong&gt;Ask about squirrels. &lt;/strong&gt;In his documentary about the debate surrounding the death penalty, &lt;strong&gt;Werner Herzog&lt;/strong&gt; &lt;u&gt;&lt;a href="https://www.youtube.com/watch?v=KRC1fkPoa8o" rel="noopener noreferrer" target="_blank"&gt;interviews a death row chaplain&lt;/a&gt;&lt;/u&gt; who, at the start of their conversation, delivers the polished answers he’s given 100 times about accompanying people in their final minutes. Then Herzog asks him about squirrels. Thrown off, he breaks down. The grief he feels about his job is laid bare. Ask people about the unscripted things.&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;Study this. Collect great questions you like. Build prompts to borrow these techniques for structured AI-driven sessions if you want. &lt;/p&gt;&lt;p&gt;But the judgment underneath these habits remains harder to transfer. It’s its own form of tacit knowledge. 
And for now, it still belongs to humans.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;em&gt;&lt;a href="https://every.to/@eleanor_b03474_1" rel="noopener noreferrer" target="_blank"&gt;Eleanor Warnock&lt;/a&gt;&lt;/em&gt;&lt;/strong&gt; &lt;em&gt;is the managing editor at Every. She has been a business journalist and editor at the&lt;/em&gt; Wall Street Journal &lt;em&gt;and the&lt;/em&gt; Financial Times&lt;em&gt;-backed&lt;/em&gt; &lt;em&gt;Sifted&lt;/em&gt;, &lt;em&gt;and is an advisor to Bek Ventures. Follow her on &lt;a href="https://www.linkedin.com/in/eleanor-warnock-6a671037?originalSubdomain=uk" rel="noopener noreferrer" target="_blank"&gt;LinkedIn&lt;/a&gt; and &lt;a href="https://www.eleanot.es/" rel="noopener noreferrer" target="_blank"&gt;Substack&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;To read more essays like this, subscribe to &lt;u&gt;&lt;a href="https://every.to/subscribe" rel="noopener noreferrer" target="_blank"&gt;Every&lt;/a&gt;&lt;/u&gt;, and follow us on X at &lt;u&gt;&lt;a href="http://twitter.com/every" rel="noopener noreferrer" target="_blank"&gt;@every&lt;/a&gt;&lt;/u&gt; and on &lt;u&gt;&lt;a href="https://www.linkedin.com/company/everyinc/" rel="noopener noreferrer" target="_blank"&gt;LinkedIn&lt;/a&gt;&lt;/u&gt;.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;For sponsorship opportunities, reach out to sponsorships@every.to.&lt;/em&gt;&lt;/p&gt;&lt;div class="quill-button" data-source="{&amp;quot;id&amp;quot;:&amp;quot;quill-button-1769187301610&amp;quot;,&amp;quot;url&amp;quot;:&amp;quot;https://every.to/subscribe?source=post_button&amp;quot;,&amp;quot;text&amp;quot;:&amp;quot;Subscribe&amp;quot;}" id="quill-button-1769187301610"&gt;&lt;a href="https://every.to/subscribe?source=post_button"&gt;Subscribe&lt;/a&gt;&lt;/div&gt;</description>
      <author>Eleanor Warnock</author>
      <pubDate>2026-05-11 06:00:00 -0400</pubDate>
      <guid>https://every.to/p/socrates-as-a-service</guid>
      <link>https://every.to/p/socrates-as-a-service</link>
    </item>
    <item>
      <title>AI Work Is Splitting in Two</title>
      <description>&lt;table&gt;&lt;tr&gt;&lt;td&gt;&lt;img alt="Context Window" src="https://d24ovhgu8s7341.cloudfront.net/uploads/publication/logo/94/small_context_windown_1.png" /&gt;&lt;/td&gt;&lt;td&gt;&lt;/td&gt;&lt;td&gt;&lt;table&gt;&lt;tr&gt;&lt;td&gt;by &lt;a href="https://every.to/@Every%20Staff" itemprop="name"&gt;Every Staff&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;in &lt;a href="https://every.to/context-window"&gt;Context Window&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;figure&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/post/cover/4191/full_page_cover_981b2a88875c9dac-CW.png"&gt;&lt;figcaption&gt;Midjourney/Every illustration.&lt;/figcaption&gt;&lt;/figure&gt;&lt;p&gt;Hello, and happy Sunday! This week belonged to agents. OpenAI had a “low-key” launch party for &lt;u&gt;&lt;a href="https://every.to/vibe-check/gpt-5-5" rel="noopener noreferrer" target="_blank"&gt;GPT-5.5&lt;/a&gt;&lt;/u&gt; on May 5 at 5:55 p.m., a time chosen by the model itself. The following day, Anthropic held its second annual &lt;u&gt;&lt;a href="https://every.to/chain-of-thought/inside-anthropic-s-2026-developer-conference" rel="noopener noreferrer" target="_blank"&gt;Code with Claude developer conference&lt;/a&gt;&lt;/u&gt;, where the company announced three new features for its Managed Agents product, along with—more surprisingly—a partnership to use SpaceX’s Colossus supercluster.&lt;/p&gt;&lt;p&gt;Every was on the ground in San Francisco at Code with Claude. 
Taken together with the way Codex has been showing up inside Every, it became easier to see that &lt;u&gt;&lt;a href="https://every.to/chain-of-thought/inside-anthropic-s-2026-developer-conference" rel="noopener noreferrer" target="_blank"&gt;battle lines are being drawn&lt;/a&gt;&lt;/u&gt; on two fronts: desktop apps for you and a model to collaborate with in real time as you work, and long-running agents like &lt;u&gt;&lt;a href="https://every.to/guides/claw-school" rel="noopener noreferrer" target="_blank"&gt;OpenClaw&lt;/a&gt;&lt;/u&gt; or Claude Managed Agents that teams hand off work to. It matches how agents inside Every &lt;u&gt;&lt;a href="https://every.to/context-window/the-dawn-of-codex-native-apps#inside-every" rel="noopener noreferrer" target="_blank"&gt;have bifurcated&lt;/a&gt;&lt;/u&gt; into ones we delegate to and ones we collaborate with, and signal we’re seeing from frontier labs &lt;u&gt;&lt;a href="https://every.to/context-window/the-dawn-of-codex-native-apps#signal" rel="noopener noreferrer" target="_blank"&gt;embedding employees&lt;/a&gt;&lt;/u&gt; in large enterprises.&lt;/p&gt;&lt;p&gt;Scroll down for a special weekend &lt;em&gt;AI &amp;amp; I&lt;/em&gt; with two engineering heads at Anthropic, workflows to steal for &lt;u&gt;&lt;a href="https://every.to/context-window/the-dawn-of-codex-native-apps#steal-this-workflow" rel="noopener noreferrer" target="_blank"&gt;hitting inbox zero with Codex&lt;/a&gt;&lt;/u&gt; or &lt;u&gt;&lt;a href="https://every.to/context-window/openai-flips-the-script#steal-this-workflow" rel="noopener noreferrer" target="_blank"&gt;deciding which AI tools are worth testing&lt;/a&gt;&lt;/u&gt;, and how Every COO &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@brandon_5263" rel="noopener noreferrer" target="_blank"&gt;Brandon Gell&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; &lt;u&gt;&lt;a href="https://every.to/context-window/openai-flips-the-script" rel="noopener noreferrer" target="_blank"&gt;instills 
curiosity&lt;/a&gt;&lt;/u&gt; in both his newborn son—and in himself. We’ve also been keeping an eye on the &lt;strong&gt;Elon Musk&lt;/strong&gt; versus OpenAI trial. Discovery has surfaced plenty of gossipy, occasionally jaw-dropping text messages, but so far none of it changes much for the day-to-day user.—&lt;em&gt;&lt;u&gt;&lt;a href="https://every.to/@kate_1767" rel="noopener noreferrer" target="_blank"&gt;Kate Lee&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;Was this newsletter forwarded to you? &lt;u&gt;&lt;a href="https://every.to/account" rel="noopener noreferrer" target="_blank"&gt;Sign up&lt;/a&gt;&lt;/u&gt; to get it in your inbox.&lt;/em&gt;&lt;/p&gt;&lt;h2&gt;&lt;hr class="quill-line"&gt;&lt;/h2&gt;&lt;h2&gt;‘AI &amp;amp; I’: The secrets of Claude’s platform from the team that built it&lt;/h2&gt;&lt;p&gt;In the future, you’ll be able to accomplish a goal by just giving Claude an outcome and a budget.&lt;/p&gt;&lt;p&gt;That’s the direction Anthropic is building in with its new Managed Agents features, announced at this week’s Code with Claude developer event. The basic idea: Claude, wrapped in a computer in the cloud, that you can spin up, scale, and manage as needed. Anthropic is taking on the infrastructure that kills most agent products, and making sure that it scales to meet the needs of agents running 24/7. &lt;/p&gt;&lt;p&gt;On a special episode of &lt;em&gt;AI &amp;amp; I &lt;/em&gt;recorded at Code with Claude, &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@danshipper" rel="noopener noreferrer" target="_blank"&gt;Dan Shipper&lt;/a&gt;&lt;/u&gt; &lt;/strong&gt;talks with Jiang and &lt;strong&gt;Katelyn Lesse&lt;/strong&gt;, head of engineering for the Claude platform, about what it takes to build an AI infrastructure platform. This is a must-watch for anyone trying to take an agent past the demo and into production. 
Watch &lt;a href="https://x.com/danshipper/status/2052860977696088126" rel="noopener noreferrer" target="_blank"&gt;on X&lt;/a&gt; or &lt;u&gt;&lt;a href="https://youtu.be/lLypHkIVLqc" rel="noopener noreferrer" target="_blank"&gt;YouTube&lt;/a&gt;&lt;/u&gt;, or listen on &lt;u&gt;&lt;a href="https://open.spotify.com/episode/417iXrL8S7epUz1F0GEHPZ?si=PmNN8NuYTIaW4IBmkTg_9Q" rel="noopener noreferrer" target="_blank"&gt;Spotify&lt;/a&gt;&lt;/u&gt; or &lt;a href="https://podcasts.apple.com/us/podcast/the-secrets-of-claudes-platform-from-the-team-who-built-it/id1719789201?i=1000766844063" rel="noopener noreferrer" target="_blank"&gt;Apple Podcasts&lt;/a&gt;.&lt;/p&gt;&lt;div class="quill-youtube" id="undefined" data-source="{&amp;quot;url&amp;quot;:&amp;quot;https://youtu.be/lLypHkIVLqc&amp;quot;,&amp;quot;height&amp;quot;:&amp;quot;400&amp;quot;,&amp;quot;youtube_id&amp;quot;:&amp;quot;lLypHkIVLqc&amp;quot;}" data-height="400" data-youtube-id="lLypHkIVLqc" style="max-height: 400px; overflow: hidden;"&gt;&lt;a href="https://youtu.be/lLypHkIVLqc" target="_blank" rel="noopener noreferrer"&gt;&lt;img src="https://img.youtube.com/vi/lLypHkIVLqc/maxresdefault.jpg" style="width: 100%; aspect-ratio: 16 / 9; display: block;"&gt;&lt;div class="play"&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/static/emails/youtube-logo.png"&gt;&lt;/div&gt;&lt;/a&gt;&lt;/div&gt;&lt;p&gt;Miss an episode? 
Catch up on Dan’s recent conversations with Stripe’s &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/context-window/compute-is-the-new-cash" rel="noopener noreferrer" target="_blank"&gt;Emily Glassberg Sands&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;, Every’s &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/context-window/every-is-half-agent-now" rel="noopener noreferrer" target="_blank"&gt;Brandon Gell&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;&lt;u&gt;&lt;a href="https://every.to/context-window/every-is-half-agent-now" rel="noopener noreferrer" target="_blank"&gt; and &lt;/a&gt;&lt;/u&gt;&lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/context-window/every-is-half-agent-now" rel="noopener noreferrer" target="_blank"&gt;Willie Williams&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;, Linear cofounder &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/context-window/if-saas-is-dead-linear-didn-t-get-the-memo" rel="noopener noreferrer" target="_blank"&gt;Karri Saarinen&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;, and others, and learn how they use AI to think, create, and relate.&lt;/p&gt;&lt;h2&gt;&lt;hr class="quill-line"&gt;&lt;/h2&gt;&lt;h2&gt;Knowledge base&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/chain-of-thought/inside-anthropic-s-2026-developer-conference" rel="noopener noreferrer" target="_blank"&gt;“Inside Anthropic’s 2026 Developer Conference”&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; &lt;em&gt;by&lt;/em&gt; &lt;em&gt;&lt;u&gt;&lt;a href="https://every.to/@danshipper" rel="noopener noreferrer" target="_blank"&gt;Dan Shipper&lt;/a&gt;&lt;/u&gt;,&lt;/em&gt; &lt;em&gt;&lt;u&gt;&lt;a href="https://every.to/@marcus_fd8302_1" rel="noopener noreferrer" target="_blank"&gt;Marcus Moretti&lt;/a&gt;&lt;/u&gt;, and&lt;/em&gt; &lt;em&gt;&lt;u&gt;&lt;a href="https://every.to/@katie.parrott12" rel="noopener noreferrer" target="_blank"&gt;Katie Parrott&lt;/a&gt;&lt;/u&gt;/Chain of Thought:&lt;/em&gt; Dan and &lt;strong&gt;&lt;u&gt;&lt;a href="https://cora.computer" 
rel="noopener noreferrer" target="_blank"&gt;Cora&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; general manager &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@kieran_1355" rel="noopener noreferrer" target="_blank"&gt;Kieran Klaassen&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; attended Anthropic’s 2026 Code with Claude, and this piece is a report from the ground. The centerpiece is Anthropic’s new Managed Agents features, which &lt;strong&gt;&lt;u&gt;&lt;a href="https://writewithspiral.com" rel="noopener noreferrer" target="_blank"&gt;Spiral&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; general manager &lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://every.to/@marcus_fd8302_1" rel="noopener noreferrer" target="_blank"&gt;Marcus Moretti&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt; has been testing in his workflows, as well as the new “Dreaming” feature Kieran is most excited about. Read this for what Anthropic announced, what mattered, and how the tools are already being used in practice.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/working-overtime/i-let-chatgpt-manage-my-workweek" rel="noopener noreferrer" target="_blank"&gt;“I Let ChatGPT Manage My Workweek”&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; &lt;em&gt;by&lt;/em&gt; &lt;em&gt;Katie Parrott/Working Overtime:&lt;/em&gt; &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@katie.parrott12" rel="noopener noreferrer" target="_blank"&gt;Katie Parrott&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; is a self-described disaster at project management, a gap she papered over for 15 years by keeping deadlines in her head and avoiding ambitious projects. As her work got more complex, that stopped being sustainable, so she built a ChatGPT agent that reads her OKRs, calendar, Notion, and Slack and tells her what to do next. 
Read this for the setup, the limits AI can’t fix, and the copyable prompt that powers the whole system.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/thesis/the-culture-of-ai-engineering" rel="noopener noreferrer" target="_blank"&gt;“The Culture of AI Engineering”&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; &lt;em&gt;by Noah Brier/Thesis: &lt;/em&gt;The “software factory” metaphor is everywhere in AI engineering, but Alephic cofounder &lt;strong&gt;Noah Brier&lt;/strong&gt; argues it’s the wrong one. Running a software company is less like &lt;strong&gt;Henry Ford&lt;/strong&gt;’s assembly line and more like &lt;strong&gt;Andy Warhol&lt;/strong&gt;’s studio: The hard problem isn’t throughput, it’s keeping everyone building the same vision. &lt;strong&gt;Brier&lt;/strong&gt; adapts &lt;strong&gt;Stewart Brand&lt;/strong&gt;’s pace layers framework into a five-level cultural stack to keep humans and agents aligned. Read this to understand why onboarding your agents matters as much as onboarding your engineers.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;u&gt;“&lt;a href="https://every.to/context-window/the-dawn-of-codex-native-apps" rel="noopener noreferrer" target="_blank"&gt;The Dawn of Codex-native Apps”&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; &lt;em&gt;by&lt;/em&gt; &lt;em&gt;Katie Parrott/Context Window:&lt;/em&gt; AI work is splitting into two modes—delegation and collaboration—and the new meta-skill is knowing which one fits the task. 
Read this to discover why the &lt;u&gt;&lt;a href="https://every.to/chain-of-thought/the-knowledge-economy-is-over-welcome-to-the-allocation-economy" rel="noopener noreferrer" target="_blank"&gt;allocation economy&lt;/a&gt;&lt;/u&gt; thesis was only right about half the work, and what’s in the other half.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;u&gt;&lt;a href="http://every.to/context-window/openai-flips-the-script" rel="noopener noreferrer" target="_blank"&gt;“OpenAI Flips the Script”&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; &lt;em&gt;by Laura Entis/Context Window: &lt;/em&gt;Three months after &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@danshipper" rel="noopener noreferrer" target="_blank"&gt;Dan Shipper&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; wrote that OpenAI had catching up to do, he and head of growth &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@tedescau" rel="noopener noreferrer" target="_blank"&gt;Austin Tedesco&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; have made Codex their daily driver for strategy docs, recruiting, and other kinds of knowledge work. 
🎧 🖥 Listen to their episode of &lt;em&gt;AI &amp;amp; I&lt;/em&gt; on &lt;u&gt;&lt;a href="https://open.spotify.com/episode/2HuoYt9ZV6CzY6foHL1vJe?si=98cb3DpLR266jg06bR2SXg" rel="noopener noreferrer" target="_blank"&gt;Spotify&lt;/a&gt;&lt;/u&gt; or &lt;u&gt;&lt;a href="https://podcasts.apple.com/us/podcast/why-we-switched-from-claude-code-to-codex/id1719789201?i=1000766460229" rel="noopener noreferrer" target="_blank"&gt;Apple Podcasts&lt;/a&gt;&lt;/u&gt;, or watch on &lt;u&gt;&lt;a href="https://x.com/danshipper/status/2052054077656252512" rel="noopener noreferrer" target="_blank"&gt;X&lt;/a&gt;&lt;/u&gt; or &lt;u&gt;&lt;a href="https://youtu.be/x9BNBcP_C7Q" rel="noopener noreferrer" target="_blank"&gt;YouTube&lt;/a&gt;&lt;/u&gt;.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;From Every Studio&lt;/h2&gt;&lt;h5&gt;&lt;strong&gt;Spiral lets you start from a blank page and stop mid-stream&lt;/strong&gt;&lt;/h5&gt;&lt;p&gt;&lt;strong&gt;&lt;u&gt;&lt;a href="https://writewithspiral.com/" rel="noopener noreferrer" target="_blank"&gt;Spiral&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; is one of the first products to use Claude’s new multi-agent feature in production. When you use the Spiral CLI to request multiple drafts, a &lt;u&gt;&lt;a href="https://every.to/chain-of-thought/inside-anthropic-s-2026-developer-conference" rel="noopener noreferrer" target="_blank"&gt;Managed Agent&lt;/a&gt;&lt;/u&gt; spins up multiple Opus-class subagents to write your drafts in parallel— cutting the response time by 20-30 seconds per draft. Spiral also shipped improvements to the core app flow. You can start a session with a blank draft in addition to a new chat message. You can stop a Spiral response mid-stream if you need to add or change something from your previous message. 
And the guard against AI tells in Spiral output has been improved based on user input.—&lt;em&gt;&lt;u&gt;&lt;a href="https://every.to/@marcus_fd8302_1" rel="noopener noreferrer" target="_blank"&gt;Marcus Moretti&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;Alignment &lt;/h2&gt;&lt;p&gt;&lt;strong&gt;The case for optimism. &lt;/strong&gt;The holy grail of any product is low marginal cost and high value. That is why software ate the world and why investors loved it. Biotechnology, however, is the polar opposite. A new drug costs hundreds of millions of dollars in research and development, then has to clear approval, then has to be manufactured, and out of every 100 candidates, only two or three reach the pharmacy shelf. The gross margins are &lt;em&gt;fine&lt;/em&gt; once a drug ships, but the pipeline to get there is long and expensive. &lt;/p&gt;&lt;p&gt;Biotech was never going to scale the way software did. Yet R&amp;amp;D productivity in biotech is rising for the &lt;u&gt;&lt;a href="https://www.stifel.com/newsletters/investmentbanking/bal/marketing/healthcare/biopharma_timopler/2026/BiopharmaMarketUpdate_010826.pdf" rel="noopener noreferrer" target="_blank"&gt;first time&lt;/a&gt;&lt;/u&gt; in many years, and the investors who called biotech a money pit are back at the table. There are a couple of reasons why.&lt;/p&gt;&lt;p&gt;We understand biology a lot better than we did even a decade ago, because we’re able to narrow the search space before we run an experiment. AlphaFold—Google DeepMind’s AI program for predicting the 3D shapes of proteins—mapped roughly &lt;u&gt;&lt;a href="https://www.theguardian.com/technology/2022/jul/28/deepmind-uncovers-structure-of-200m-proteins-in-scientific-leap-forward" rel="noopener noreferrer" target="_blank"&gt;200 million&lt;/a&gt;&lt;/u&gt; of them in a year. 
Instead of spending years figuring out a target’s structure, researchers can now begin with that information already in front of them.&lt;/p&gt;&lt;p&gt;The second reason is the &lt;a href="https://x.com/EricTopol/status/2025265560901292279" rel="noopener noreferrer" target="_blank"&gt;collapse in the cost&lt;/a&gt; of reading the genome. Sequencing a single human genome cost around &lt;u&gt;&lt;a href="https://datahub.io/technology/genome-sequencing-costs" rel="noopener noreferrer" target="_blank"&gt;$100 million&lt;/a&gt;&lt;/u&gt; in 2001 and now costs about $200. We can sequence at population scale, and once you’re able to do so, you can start to see which genetic variants drive disease and which are noise.&lt;/p&gt;&lt;div class="quill-block-image" id="quill-block-image-1778272005562" data-source="{&amp;quot;dom_id&amp;quot;:&amp;quot;quill-block-image-1778272005562&amp;quot;,&amp;quot;link&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4191/optimized_58cb0d62-8605-475a-bc1c-dafad3e67505.png&amp;quot;,&amp;quot;image&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4191/optimized_58cb0d62-8605-475a-bc1c-dafad3e67505.png&amp;quot;,&amp;quot;caption&amp;quot;:&amp;quot;A turning point for personalized medicine. (Source: X/ErikTopol.)&amp;quot;,&amp;quot;error&amp;quot;:null}"&gt;&lt;div&gt;&lt;a href="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4191/optimized_58cb0d62-8605-475a-bc1c-dafad3e67505.png" target="_blank" rel="noopener noreferrer"&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4191/optimized_58cb0d62-8605-475a-bc1c-dafad3e67505.png" alt="A turning point for personalized medicine. (Source: X/ErikTopol.)"&gt;&lt;/a&gt;&lt;figcaption class="quill-image-caption"&gt;A turning point for personalized medicine. 
(Source: X/ErikTopol.)&lt;/figcaption&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;&lt;br&gt;&lt;/p&gt;&lt;p&gt;We now have maps of proteins, genes, and cells that are starting to add up to a coherent picture of disease. For most of the history of medicine, we worked at the level of the organ, so we could see the disease but never its origins. Now we work at the level where disease happens—a genetic variant produces a misfolded protein, the misfolded protein disrupts a cellular pathway, and the cellular disruption is the disease. &lt;/p&gt;&lt;p&gt;Of course, the marginal cost of a drug will never be zero. But the marginal cost of asking what a disease is, and where to look for the answer, is collapsing. Lower R&amp;amp;D costs mean more breakthrough drugs, which means patients live longer and investors make money. The incentives, for once, point in the same direction.—&lt;em&gt;&lt;u&gt;&lt;a href="https://x.com/Ashwinreads" rel="noopener noreferrer" target="_blank"&gt;Ashwin Sharma&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/p&gt;&lt;h3&gt;&lt;hr class="quill-line"&gt;&lt;/h3&gt;&lt;p&gt;&lt;em&gt;That’s all for this week! Be sure to follow Every on X at &lt;u&gt;&lt;a href="https://twitter.com/every" rel="noopener noreferrer" target="_blank"&gt;@every&lt;/a&gt;&lt;/u&gt; and on &lt;u&gt;&lt;a href="https://www.linkedin.com/company/everyinc/" rel="noopener noreferrer" target="_blank"&gt;LinkedIn&lt;/a&gt;&lt;/u&gt;.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;We &lt;u&gt;&lt;a href="https://every.to/studio" rel="noopener noreferrer" target="_blank"&gt;build AI tools&lt;/a&gt;&lt;/u&gt; for readers like you. Write brilliantly with &lt;u&gt;&lt;a href="https://writewithspiral.com/" rel="noopener noreferrer" target="_blank"&gt;Spiral&lt;/a&gt;&lt;/u&gt;. 
Organize files automatically with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://makeitsparkle.co/?utm_source=everyfooter" rel="noopener noreferrer" target="_blank"&gt;Sparkle&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. Deliver yourself from email with &lt;u&gt;&lt;a href="https://cora.computer" rel="noopener noreferrer" target="_blank"&gt;Cora&lt;/a&gt;&lt;/u&gt;. Dictate effortlessly with &lt;u&gt;&lt;a href="https://monologue.to" rel="noopener noreferrer" target="_blank"&gt;Monologue&lt;/a&gt;&lt;/u&gt;. Work on documents with AI agents using &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://www.proofeditor.ai/?source=post_button" rel="noopener noreferrer" target="_blank"&gt;Proof&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;For sponsorship opportunities, reach out to sponsorships@every.to.&lt;/em&gt;&lt;/p&gt;&lt;div class="quill-button" data-source="{&amp;quot;id&amp;quot;:&amp;quot;quill-button-1778272150888&amp;quot;,&amp;quot;text&amp;quot;:&amp;quot;Upgrade to paid&amp;quot;,&amp;quot;url&amp;quot;:&amp;quot;https://every.to/subscribe?source=post_button&amp;quot;}" id="quill-button-1778272150888"&gt;&lt;a href="https://every.to/subscribe?source=post_button"&gt;Upgrade to paid&lt;/a&gt;&lt;/div&gt;</description>
      <author>Every Staff / Context Window</author>
      <pubDate>2026-05-10 12:00:00 -0400</pubDate>
      <guid>https://every.to/context-window/ai-work-is-splitting-in-two</guid>
      <link>https://every.to/context-window/ai-work-is-splitting-in-two</link>
    </item>
    <item>
      <title>The Culture of AI Engineering</title>
      <description>&lt;table&gt;&lt;tr&gt;&lt;td&gt;&lt;img alt="Thesis" src="https://d24ovhgu8s7341.cloudfront.net/uploads/publication/logo/98/small_Screenshot_2024-10-28_at_10.50.48_AM.png" /&gt;&lt;/td&gt;&lt;td&gt;&lt;/td&gt;&lt;td&gt;&lt;table&gt;&lt;tr&gt;&lt;td&gt;by &lt;a href="https://every.to/@noah_1729" itemprop="name"&gt;Noah Brier&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;in &lt;a href="https://every.to/thesis"&gt;Thesis&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;figure&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/post/cover/4190/full_page_cover_3b2c1b4e4c552792-Thesis_May_8.png"&gt;&lt;figcaption&gt;Sarah Jay Halliday/Every illustration.&lt;/figcaption&gt;&lt;/figure&gt;&lt;p&gt;&lt;strong&gt;&lt;em&gt;Noah Brier&lt;/em&gt;&lt;/strong&gt;&lt;em&gt; cofounded Percolate in 2011 and learned the CEO’s hardest job: keeping a whole company pointed in the same direction. Now, at his AI consultancy&lt;/em&gt; &lt;em&gt;&lt;u&gt;&lt;a href="https://www.alephic.com/" rel="noopener noreferrer" target="_blank"&gt;Alephic&lt;/a&gt;&lt;/u&gt;—and in his own work, where he uses Claude Code as a&lt;/em&gt; &lt;em&gt;&lt;u&gt;&lt;a href="https://www.youtube.com/watch?v=8V9tZwgjiRs" rel="noopener noreferrer" target="_blank"&gt;second brain&lt;/a&gt;&lt;/u&gt;—he’s facing that same problem with agents in the mix. AI was supposed to make coordination easier. Instead, Noah argues, it has created new coordination problems of its own. 
In this piece, he pushes back on the “software factory” metaphor and offers a framework, drawn from &lt;/em&gt;&lt;strong&gt;&lt;em&gt;Stewart Brand&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;’s pace layers, for getting carbon and silicon to build the same thing.—&lt;u&gt;&lt;a href="https://every.to/@kate_1767" rel="noopener noreferrer" target="_blank"&gt;Kate Lee&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;StrongDM is a software company whose three-person AI team calls its system for autonomous code generation a &lt;u&gt;&lt;a href="https://factory.strongdm.ai/" rel="noopener noreferrer" target="_blank"&gt;“Software Factory.”&lt;/a&gt;&lt;/u&gt; Entrepreneur &lt;strong&gt;Dan Shapiro&lt;/strong&gt;’s &lt;u&gt;&lt;a href="https://www.danshapiro.com/blog/2026/01/the-five-levels-from-spicy-autocomplete-to-the-software-factory/" rel="noopener noreferrer" target="_blank"&gt;widely circulated framework for AI coding&lt;/a&gt;&lt;/u&gt; culminates in “the Dark Factory,” named after a Japanese robotics plant that &lt;u&gt;&lt;a href="https://en.wikipedia.org/wiki/Lights_out_(manufacturing)" rel="noopener noreferrer" target="_blank"&gt;runs with the lights off&lt;/a&gt;&lt;/u&gt;. &lt;u&gt;&lt;a href="http://factory.ai" rel="noopener noreferrer" target="_blank"&gt;Factory.ai&lt;/a&gt;&lt;/u&gt;, which has raised millions from Sequoia and Khosla Ventures, has built an entire business around the metaphor—its autonomous coding agents are called Droids. 
&lt;/p&gt;&lt;p&gt;I’ve been incorporating many of StrongDM’s concepts about agentic software development into our work at &lt;u&gt;&lt;a href="https://www.alephic.com/" rel="noopener noreferrer" target="_blank"&gt;Alephic&lt;/a&gt;&lt;/u&gt;, the consulting company I co-founded—but I have one fundamental disagreement: I think factory is the wrong metaphor.&lt;/p&gt;&lt;p&gt;If the hardest problem is making something people want, then the process of building software looks a lot more like &lt;strong&gt;Andy Warhol&lt;/strong&gt;’s factory than &lt;strong&gt;Henry Ford&lt;/strong&gt;’s. Both are focused on throughput, but Ford’s is focused on mechanization and stamping out identical cars with as little variance as possible. Warhol, on the other hand, was concerned with ensuring all work aligned with a single creative vision.&lt;/p&gt;&lt;p&gt;Ford’s factory—or more specifically, the assembly lines inside it—was designed to eliminate imperfections. &lt;u&gt;&lt;a href="https://en.wikipedia.org/wiki/Six_Sigma" rel="noopener noreferrer" target="_blank"&gt;Six Sigma&lt;/a&gt;&lt;/u&gt;, the quality methodology made famous by General Electric and beloved of manufacturers, is literally a measure of the defect rate. But quality starts with deciding what to build. This is why &lt;u&gt;&lt;a href="https://pmarchive.com/guide_to_startups_part4.html" rel="noopener noreferrer" target="_blank"&gt;product-market fit&lt;/a&gt;&lt;/u&gt; is the lingua franca of startups: If you haven’t built something the market needs, nothing else—including the quality of your code—matters.&lt;/p&gt;&lt;p&gt;Too much of the industry treats software as a problem to be optimized and solved. 
That may be true for code writing and testing, but the better metaphor is staring us in the face: It’s a software &lt;em&gt;company&lt;/em&gt;, not a software &lt;em&gt;factory&lt;/em&gt;.&lt;/p&gt;&lt;p&gt;Just as in the days before AI, the hardest problem for a business is still creating a vision and alignment around it—how to keep an entire team of humans, and now humans and agents (and humans with agents), building toward the same vision, from the system architecture down to the individual lines of code. As I learned long before agents existed, achieving this is much more akin to building a startup than assembling a car. What follows is my attempt at a framework for keeping an entire system of humans and agents building the same thing. &lt;/p&gt;&lt;h2&gt;The alignment problem isn’t new—and AI didn’t solve it&lt;/h2&gt;&lt;p&gt;I ran into this alignment problem years ago, when I cofounded the company Percolate, a content marketing platform, in 2011. As we grew the business from zero to 100 people in less than three years, my job as CEO shifted from building the product to building a company capable of building the product. My agents were people, and my job was to design the system they worked within. Culture, I concluded, was one of the strongest levers I had.&lt;/p&gt;&lt;p&gt;As &lt;strong&gt;&lt;u&gt;&lt;a href="https://www.welcometothejungle.com/en/articles/ben-horowitz-culture-corporate-book" rel="noopener noreferrer" target="_blank"&gt;Ben Horowitz&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;&lt;u&gt;&lt;a href="https://www.welcometothejungle.com/en/articles/ben-horowitz-culture-corporate-book" rel="noopener noreferrer" target="_blank"&gt; put it&lt;/a&gt;&lt;/u&gt;, culture is “how your company makes decisions when you’re not there.” This was exactly what I needed: documents, tools, and rituals that helped each individual make the best possible decision without having to run every decision up the chain. 
I probably spent half my time on this, building a &lt;a href="https://review.firstround.com/this-startup-built-internal-tools-to-fuel-major-growth-heres-their-approach/" rel="noopener noreferrer" target="_blank"&gt;living culture document&lt;/a&gt;, running onboarding sessions for every new hire, and developing &lt;a href="https://review.firstround.com/this-startup-built-internal-tools-to-fuel-major-growth-heres-their-approach/" rel="noopener noreferrer" target="_blank"&gt;internal tools&lt;/a&gt; that automatically routed knowledge to the right people.&lt;/p&gt;&lt;p&gt;Every new technology promises to solve these coordination problems. But of course, nothing is that simple. What they do in reality is reshape the landscape around them and, in the process, create new problems that didn’t exist before. AI is no different.  &lt;/p&gt;&lt;p&gt;Open-source software offers an early glimpse of the kind of unexpected problems that AI can create: Whereas the primary challenge a few years ago was finding maintainers willing to contribute code on goodwill alone, today’s challenge is sifting through hundreds of crappy &lt;u&gt;&lt;a href="https://boristane.com/blog/slop-creep-enshittification-of-software/" rel="noopener noreferrer" target="_blank"&gt;AI-generated pull requests flooding GitHub&lt;/a&gt;&lt;/u&gt;. &lt;/p&gt;&lt;p&gt;Now, 15 years later, my audience at &lt;u&gt;&lt;a href="http://alephic.com" rel="noopener noreferrer" target="_blank"&gt;Alephic&lt;/a&gt;&lt;/u&gt; is not just the humans who work with me. Those humans are often paired with agents, and, increasingly, the agents themselves are delivering work independently. Yet the core problem is identical. &lt;/p&gt;&lt;p&gt;If you’ve used a coding agent for more than a week, you’ve already experienced this: The code works, but it often feels written by someone most definitely not you—ignoring obvious abstractions and stylistic norms that are present in the codebase. 
It looks, in other words, like a new engineer on the team who hasn’t been properly onboarded. We write onboarding documents and do training for our human colleagues, but most people don’t do this for agents. Yet.&lt;/p&gt;&lt;h2&gt;Pace layers of AI engineering &lt;/h2&gt;&lt;p&gt;I still have an onboarding document and set of activities every new hire goes through during their first week, including building a module in our homegrown learning system as their first coding task (a few recent editions were GPUs, &lt;u&gt;&lt;a href="https://huggingface.co/docs/optimum/concept_guides/quantization" rel="noopener noreferrer" target="_blank"&gt;quantization&lt;/a&gt;&lt;/u&gt;, and &lt;u&gt;&lt;a href="https://developers.openai.com/commerce" rel="noopener noreferrer" target="_blank"&gt;agentic commerce protocols&lt;/a&gt;&lt;/u&gt;).&lt;/p&gt;&lt;p&gt;But I am also building tools that go further and ensure our code is maintainable, consistent, and built the way we’d want it built. &lt;/p&gt;&lt;p&gt;I think about our tooling as a kind of cultural stack, where standards inform architectures, which in turn inform specs, plans, and code. The layers are inspired by counterculture systems thinker &lt;strong&gt;&lt;u&gt;&lt;a href="https://jods.mitpress.mit.edu/pub/issue3-brand/release/2" rel="noopener noreferrer" target="_blank"&gt;Stewart Brand&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;&lt;u&gt;’&lt;a href="https://jods.mitpress.mit.edu/pub/issue3-brand/release/2" rel="noopener noreferrer" target="_blank"&gt;s pace layers framework&lt;/a&gt;&lt;/u&gt;. It’s a model for how society changes at different speeds, from nature, which shifts over millennia, to fashion, which can change by the day. 
The lower layers move slowly; the upper ones move fast.&lt;/p&gt;&lt;div class="quill-block-image" id="quill-block-image-1778227477634-kk4ynwrz9" data-source="{&amp;quot;dom_id&amp;quot;:&amp;quot;quill-block-image-1778227477634-kk4ynwrz9&amp;quot;,&amp;quot;link&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4190/optimized_cbd5b1ab-97c5-4ec3-9cb9-b3639a1ceb31.png&amp;quot;,&amp;quot;image&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4190/optimized_cbd5b1ab-97c5-4ec3-9cb9-b3639a1ceb31.png&amp;quot;,&amp;quot;caption&amp;quot;:&amp;quot;Stewart Brand’s Pace Layers framework offers a vision of how society works, from nature (changes over millennia) to fashion (changes daily). (Source: Stewart Brand.)&amp;quot;,&amp;quot;error&amp;quot;:null}"&gt;&lt;div&gt;&lt;a href="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4190/optimized_cbd5b1ab-97c5-4ec3-9cb9-b3639a1ceb31.png" target="_blank" rel="noopener noreferrer"&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4190/optimized_cbd5b1ab-97c5-4ec3-9cb9-b3639a1ceb31.png" alt="Stewart Brand’s Pace Layers framework offers a vision of how society works, from nature (changes over millennia) to fashion (changes daily). (Source: Stewart Brand.)"&gt;&lt;/a&gt;&lt;figcaption class="quill-image-caption"&gt;Stewart Brand’s Pace Layers framework offers a vision of how society works, from nature (changes over millennia) to fashion (changes daily). (Source: Stewart Brand.)&lt;/figcaption&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;&lt;br&gt;&lt;/p&gt;&lt;p&gt;Brand argued that much of societal tension exists where the layers meet—when fashion reshapes culture (think about how social media rewired our norms about privacy) or culture becomes governance (how shifting attitudes towards marriage equality became law). 
Fashion, in Brand’s framing, isn’t trivial—it’s the froth layer where society experiments quickly and irresponsibly, and the occasional good idea sifts down to reshape the slower layers below. All things are ultimately reliant on the layer beneath them. Culture is subject to the laws of nature, governance to the laws of culture. &lt;/p&gt;&lt;p&gt;Those boundaries can and do shift, but recognizing the layers and the differing speeds at which they move is central to understanding why systems resist change, and what it takes to change them.&lt;/p&gt;&lt;div class="quill-block-image" id="quill-block-image-1778229913316" data-source="{&amp;quot;dom_id&amp;quot;:&amp;quot;quill-block-image-1778229913316&amp;quot;,&amp;quot;link&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4190/optimized_afca7054-7a7f-4cc9-9547-32d46ec547ab.png&amp;quot;,&amp;quot;image&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4190/optimized_afca7054-7a7f-4cc9-9547-32d46ec547ab.png&amp;quot;,&amp;quot;caption&amp;quot;:&amp;quot;The “pace layers” of AI engineering help both humans and agents move in the same direction. (Credit: Noah Brier.)&amp;quot;,&amp;quot;error&amp;quot;:null}"&gt;&lt;div&gt;&lt;a href="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4190/optimized_afca7054-7a7f-4cc9-9547-32d46ec547ab.png" target="_blank" rel="noopener noreferrer"&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4190/optimized_afca7054-7a7f-4cc9-9547-32d46ec547ab.png" alt="The “pace layers” of AI engineering help both humans and agents move in the same direction. (Credit: Noah Brier.)"&gt;&lt;/a&gt;&lt;figcaption class="quill-image-caption"&gt;The “pace layers” of AI engineering help both humans and agents move in the same direction. 
(Credit: Noah Brier.)&lt;/figcaption&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;&lt;br&gt;&lt;/p&gt;&lt;p&gt;Here’s how I’ve been thinking about the “pace layers” of AI engineering and how we’re building tooling at Alephic to help both humans and agents move in the same direction:&lt;/p&gt;&lt;ol&gt;&lt;li&gt;&lt;strong&gt;Code is fashion now&lt;/strong&gt;. Whereas it once sat deeper in the stack, where it was slower moving and insulated by other layers, in a world of AI, code is free to produce and reproduce. The challenge is how to do it right: free of bugs at the macro level, and aligned with your own vision and best practices at the micro level. By the time we get to this layer, we have to trust that the layers beneath are strong enough to steer the system to the places we need it to go.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Plans sit beneath code&lt;/strong&gt;. Before an agent writes anything, it should pause to survey the problem—&lt;u&gt;&lt;a href="https://every.to/source-code/stop-coding-and-start-planning-be0b4fd1-5898-4b09-bfda-0b00ea0004fd" rel="noopener noreferrer" target="_blank"&gt;what are the possible approaches&lt;/a&gt;&lt;/u&gt;, and what are the trade-offs? Only after completing this step should the agent pick a direction and build. Many algorithms in computer science rely on the explore-exploit shift—when you time-box a broad search phase before zeroing in on a solution to run with—and this plan phase is no different. A plan doesn’t have to be a formal document, but it must separate the &lt;u&gt;&lt;a href="https://www.noahbrier.com/archives/2019/03/exploration-vs-exploitation" rel="noopener noreferrer" target="_blank"&gt;thinking from the doing&lt;/a&gt;&lt;/u&gt;. Without this pause, exploration and execution get mashed together.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Specs sit beneath plans&lt;/strong&gt;. A good plan needs a good specification. 
That can be a ticket (a task that needs doing), a document, or just a conversation, but it explains what we are building, why we are building it, how you know you’ve done it right, and, critically, what we are not tackling right now. That last bit is particularly important for overeager AI that wants to please by building everything you wanted and a little more. There’s a &lt;u&gt;&lt;a href="https://haskellforall.com/2026/03/a-sufficiently-detailed-spec-is-code" rel="noopener noreferrer" target="_blank"&gt;good debate&lt;/a&gt;&lt;/u&gt; in the engineering community about what constitutes a good spec. It’s the simplest set of directives that shrink the planning space: a goal, a set of acceptance criteria, and an explicit list of out-of-scope problems.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Architecture is the theory of the system&lt;/strong&gt;. I’ve been keeping an ARCHITECTURE.md doc in all my codebases for a while now, borrowing from computer scientist &lt;strong&gt;Peter Naur&lt;/strong&gt;’s &lt;u&gt;&lt;a href="https://cekrem.github.io/posts/programming-as-theory-building-naur/" rel="noopener noreferrer" target="_blank"&gt;idea&lt;/a&gt;&lt;/u&gt; that the real program isn’t the code, it’s the mental model the developers carry. The document shows how the business problem maps to the codebase, so you can predict where to find the code that solves this problem. It captures the key decisions and why they were made, and lays out the rules that must always hold, such as “no database queries outside the repository layer” and “no framework imports in the business logic.” Critically, it also names what’s still an open question, so AI doesn’t silently make architectural decisions for you, taking the codebase somewhere you didn’t intend.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Standards&lt;/strong&gt; &lt;strong&gt;are the foundation.&lt;/strong&gt; Some are general principles of good software-building; others reflect our specific beliefs about how software should be built. 
One of the insights that drove me to start the company was when, years ago, I asked a developer I had worked with for a decade if I could have all his &lt;u&gt;&lt;a href="https://www.alephic.com/glossary/linting" rel="noopener noreferrer" target="_blank"&gt;configuration files&lt;/a&gt;&lt;/u&gt;, the ones that encode his rules for how code should be written. When I applied this rulebook to my own work, I became a significantly better developer. His strict approach to linting, or automated rules that reject code with unused imports or superfluous definitions, meant my code wouldn’t even run unless it met his standards. Cutting corners was no longer an option. At Alephic, we enforce many of these standards with tools like tests and static analysis, which let the computer check your code automatically. But a lot of this guidance also lives in skills we distribute across the company, so people can use it in whatever harness they choose. The code-organization skill memorializes how we want team members to organize their codebases, and coding-best-practices hardcodes the stylistic and technical preferences our platform engineering team has established.&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;With AI, we can take these ideas beyond the mechanisms of cultural exchange I had in my Percolate days (like documents and meetings) and encode them into tools that every person can interact with every day.&lt;/p&gt;&lt;p&gt;The layers at the bottom move the slowest, so they should get updated the least frequently. For instance, I could start keeping a document in a single project as a way to give agents context on how the codebase was organized. If it works well enough, I turn it into a &lt;u&gt;&lt;a href="https://every.to/vibe-check/vibe-check-claude-skills-need-a-share-button" rel="noopener noreferrer" target="_blank"&gt;skill&lt;/a&gt;&lt;/u&gt; so the rest of the team can adopt the pattern across their projects. 
Then, I can decide that it’s a fundamental piece of how we build and, eventually, a best practice I want to enforce for the entire team.&lt;/p&gt;&lt;h2&gt;Companies &amp;gt; factories&lt;/h2&gt;&lt;p&gt;While Henry Ford may be famous for the assembly line, he’s arguably more famous for his (&lt;u&gt;&lt;a href="https://quoteinvestigator.com/2011/07/28/ford-faster-horse/" rel="noopener noreferrer" target="_blank"&gt;likely&lt;/a&gt;&lt;/u&gt; apocryphal) quip about how if he asked people what they wanted, they’d say faster horses. Assembly lines exist to serve factories, just like factories exist to serve products, and products exist to serve companies. You don’t build a factory without an idea worth building it for.&lt;/p&gt;&lt;p&gt;The factory is one piece in a larger organization, where layers of co-dependent systems interact and move at different speeds. The interesting problems around alignment occur at the seams, where layers rub against each other: Is this a problem that should be solved with a meeting, a document, a skill, or a test? When does something graduate from a pattern in a codebase to something that should be established in all codebases?&lt;/p&gt;&lt;p&gt;At first glance, AI seems to smooth over these frictions. But that’s only true if you don’t scratch below the surface. What you find there is that the same problems that plague companies plague agents: incomplete information, overeager employees trying to solve the wrong problem, not wanting to admit you don’t know. The difference is speed. 
As &lt;strong&gt;&lt;u&gt;&lt;a href="https://mariozechner.at/posts/2026-03-25-thoughts-on-slowing-the-fuck-down/" rel="noopener noreferrer" target="_blank"&gt;Mario Zechner&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;, who built open-source coding agent &lt;u&gt;&lt;a href="https://github.com/badlogic/pi-mono" rel="noopener noreferrer" target="_blank"&gt;Pi&lt;/a&gt;&lt;/u&gt;, recently observed, the mess that used to take a large organization years to accumulate now arrives in weeks with a two-person team and a fleet of agents. &lt;/p&gt;&lt;p&gt;That is not a reason to retreat to being obsessed with defects. It’s a reason to take the harder problem seriously: how to keep an entire system of humans, agents, and the layers between them aligned. This problem has a decidedly human shape. Civilizations have been organizing large groups of autonomous agents to do good work for a very long time. The agents were just carbon instead of silicon.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;The man underneath the layers &lt;/h2&gt;&lt;p&gt;As part of this thesis, Every chatted to Noah about how he works and what inspires him. &lt;/p&gt;&lt;p&gt;&lt;strong&gt;If there’s a chessboard out:&lt;/strong&gt; there’s a good chance [my kids and I] will do that instead of reverting to less enriching activities like being on screens. That chess set was designed by some friends and inspired by the &lt;u&gt;&lt;a href="https://nymzo.world/" rel="noopener noreferrer" target="_blank"&gt;New York City outdoor chess scene&lt;/a&gt;&lt;/u&gt;. 
&lt;/p&gt;&lt;div class="quill-block-image" id="quill-block-image-1778227842039" data-source="{&amp;quot;dom_id&amp;quot;:&amp;quot;quill-block-image-1778227842039&amp;quot;,&amp;quot;link&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4190/optimized_cef7d2c0-5e7a-4d8c-b7be-dbc4e3219386.jpg&amp;quot;,&amp;quot;image&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4190/optimized_cef7d2c0-5e7a-4d8c-b7be-dbc4e3219386.jpg&amp;quot;,&amp;quot;caption&amp;quot;:&amp;quot;All photos courtesy of Sarah Jay Halliday for Every.&amp;quot;,&amp;quot;error&amp;quot;:null}"&gt;&lt;div&gt;&lt;a href="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4190/optimized_cef7d2c0-5e7a-4d8c-b7be-dbc4e3219386.jpg" target="_blank" rel="noopener noreferrer"&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4190/optimized_cef7d2c0-5e7a-4d8c-b7be-dbc4e3219386.jpg" alt="All photos courtesy of Sarah Jay Halliday for Every."&gt;&lt;/a&gt;&lt;figcaption class="quill-image-caption"&gt;All photos courtesy of Sarah Jay Halliday for Every.&lt;/figcaption&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;&lt;br&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;To keep me from checking email during calls: &lt;/strong&gt;I like to take notes on paper, currently with a &lt;u&gt;&lt;a href="https://www.kokuyostore.com/en_GB/campus/?srsltid=AfmBOopp2h0Wcth1923nAfxJcdA_TPDjD543sImOlcbLjsdLXkeMnUrv" rel="noopener noreferrer" target="_blank"&gt;Campus notebook&lt;/a&gt;&lt;/u&gt; and &lt;u&gt;&lt;a href="https://www.rotring.co.uk/pens-pencils/ballpoint-pens/rotring-600-ballpoint-pen/SP_1532012.html" rel="noopener noreferrer" target="_blank"&gt;rOtring 600 pen&lt;/a&gt;&lt;/u&gt;.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Re-reading the &lt;em&gt;&lt;u&gt;&lt;a href="https://www.alephic.com/sabotage" rel="noopener noreferrer" target="_blank"&gt;Simple Sabotage Field Manual&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;a 
href="https://www.alephic.com/sabotage" rel="noopener noreferrer" target="_blank"&gt;:&lt;/a&gt; a 1944 document by the precursor to the CIA, I was struck by how closely the instructions for sabotage match the realities of corporate life in America. I hired a designer and printed a few hundred beautifully bound copies, which I gave away at &lt;u&gt;&lt;a href="https://www.alephic.com/sabotage" rel="noopener noreferrer" target="_blank"&gt;my conference&lt;/a&gt;&lt;/u&gt;&lt;strong&gt;. &lt;/strong&gt;&lt;/p&gt;&lt;div class="quill-block-image" id="quill-block-image-1778229819848" data-source="{&amp;quot;dom_id&amp;quot;:&amp;quot;quill-block-image-1778229819848&amp;quot;,&amp;quot;link&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4190/optimized_d918b4ad-c088-4f28-9f6d-77c31b1faa14.jpg&amp;quot;,&amp;quot;image&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4190/optimized_d918b4ad-c088-4f28-9f6d-77c31b1faa14.jpg&amp;quot;,&amp;quot;caption&amp;quot;:null,&amp;quot;error&amp;quot;:null}"&gt;&lt;div&gt;&lt;a href="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4190/optimized_d918b4ad-c088-4f28-9f6d-77c31b1faa14.jpg" target="_blank" rel="noopener noreferrer"&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4190/optimized_d918b4ad-c088-4f28-9f6d-77c31b1faa14.jpg" alt="Uploaded image"&gt;&lt;/a&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;&lt;strong&gt;A few books I’ve pulled off the shelf recently:&lt;/strong&gt; &lt;em&gt;Toyota Production System&lt;/em&gt; (I’m thinking a lot about how we can take inspiration from these kinds of organizing principles &lt;u&gt;&lt;a href="https://www.forwarddeployed.com/" rel="noopener noreferrer" target="_blank"&gt;to align agents)&lt;/a&gt;&lt;/u&gt;, &lt;em&gt;The Medium Is the Message&lt;/em&gt; (Marshall McLuhan is a hero of mine and this comes off the shelf frequently when I just want to bump my brain a bit), and 
&lt;em&gt;&lt;u&gt;&lt;a href="https://www.desunbound.com/chapter/chapter-7" rel="noopener noreferrer" target="_blank"&gt;Orchestrating Ambiguity&lt;/a&gt;&lt;/u&gt;&lt;/em&gt; (recently recommended to me, it’s a book of books about how to design for emergence in organizations).&lt;/p&gt;&lt;p&gt;&lt;strong&gt;I really love working:&lt;/strong&gt; before anyone else has woken up, but that also requires that I wake up before then. So mostly it’s just morning time after I get my kids on the bus. &lt;/p&gt;&lt;p&gt;&lt;strong&gt;My dog’s name:&lt;/strong&gt; is Kaiya. She’s two and a half, and very much a mutt.&lt;/p&gt;&lt;div class="quill-block-image" id="quill-block-image-1778229857184" data-source="{&amp;quot;dom_id&amp;quot;:&amp;quot;quill-block-image-1778229857184&amp;quot;,&amp;quot;link&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4190/optimized_937d56b3-ff0e-4d82-b02a-97b94eb80dbe.jpg&amp;quot;,&amp;quot;image&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4190/optimized_937d56b3-ff0e-4d82-b02a-97b94eb80dbe.jpg&amp;quot;,&amp;quot;caption&amp;quot;:null,&amp;quot;error&amp;quot;:null}"&gt;&lt;div&gt;&lt;a href="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4190/optimized_937d56b3-ff0e-4d82-b02a-97b94eb80dbe.jpg" target="_blank" rel="noopener noreferrer"&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4190/optimized_937d56b3-ff0e-4d82-b02a-97b94eb80dbe.jpg" alt="Uploaded image"&gt;&lt;/a&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;Noah Brier is the co-founder of Alephic. 
&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;To read more essays like this, subscribe to &lt;u&gt;&lt;a href="https://every.to/subscribe" rel="noopener noreferrer" target="_blank"&gt;Every&lt;/a&gt;&lt;/u&gt;, and follow us on X at &lt;u&gt;&lt;a href="http://twitter.com/every" rel="noopener noreferrer" target="_blank"&gt;@every&lt;/a&gt;&lt;/u&gt; and on &lt;u&gt;&lt;a href="https://www.linkedin.com/company/everyinc/" rel="noopener noreferrer" target="_blank"&gt;LinkedIn&lt;/a&gt;&lt;/u&gt;.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;For sponsorship opportunities, reach out to sponsorships@every.to.&lt;/em&gt;&lt;/p&gt;&lt;div class="quill-button" data-source="{&amp;quot;id&amp;quot;:&amp;quot;quill-button-1769187301610&amp;quot;,&amp;quot;url&amp;quot;:&amp;quot;https://every.to/subscribe?source=post_button&amp;quot;,&amp;quot;text&amp;quot;:&amp;quot;Subscribe&amp;quot;}" id="quill-button-1769187301610"&gt;&lt;a href="https://every.to/subscribe?source=post_button"&gt;Subscribe&lt;/a&gt;&lt;/div&gt;&lt;p&gt;&lt;br&gt;&lt;/p&gt;&lt;p&gt;&lt;br&gt;&lt;/p&gt;</description>
      <author>Noah Brier / Thesis</author>
      <pubDate>2026-05-08 08:00:00 -0400</pubDate>
      <guid>https://every.to/thesis/the-culture-of-ai-engineering</guid>
      <link>https://every.to/thesis/the-culture-of-ai-engineering</link>
    </item>
    <item>
      <title>Inside Anthropic’s 2026 Developer Conference</title>
      <description>&lt;table&gt;&lt;tr&gt;&lt;td&gt;&lt;img alt="Chain of Thought" src="https://d24ovhgu8s7341.cloudfront.net/uploads/publication/logo/59/small_chain_of_thought_logo.png" /&gt;&lt;/td&gt;&lt;td&gt;&lt;/td&gt;&lt;td&gt;&lt;table&gt;&lt;tr&gt;&lt;td&gt;by &lt;a href="https://every.to/@danshipper" itemprop="name"&gt;Dan Shipper&lt;/a&gt;, &lt;a href="https://every.to/@marcus_fd8302_1" itemprop="name"&gt;Marcus Moretti&lt;/a&gt;, and &lt;a href="https://every.to/@katie.parrott12" itemprop="name"&gt;Katie Parrott&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;in &lt;a href="https://every.to/chain-of-thought"&gt;Chain of Thought&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;figure&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/post/cover/4189/full_page_cover_079dfa4c1b8120a4-anth.png"&gt;&lt;figcaption&gt;Midjourney/Every illustration.&lt;/figcaption&gt;&lt;/figure&gt;&lt;p&gt;&lt;em&gt;Was this newsletter forwarded to you? &lt;u&gt;&lt;a href="https://every.to/account" rel="noopener noreferrer" target="_blank"&gt;Sign up&lt;/a&gt;&lt;/u&gt; to get it in your inbox.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;To our surprise, the biggest launch from Anthropic’s &lt;u&gt;&lt;a href="https://claude.com/code-with-claude" rel="noopener noreferrer" target="_blank"&gt;developer conference&lt;/a&gt;&lt;/u&gt; in San Francisco yesterday wasn’t a model or a feature. Instead, it was the company’s announcement of &lt;u&gt;&lt;a href="https://www.anthropic.com/news/higher-limits-spacex" rel="noopener noreferrer" target="_blank"&gt;a deal with SpaceX&lt;/a&gt;&lt;/u&gt; to allocate all of the capacity in the latter’s Colossus supercluster to Claude.&lt;/p&gt;&lt;p&gt;Anthropic has been riding a historic demand surge over the last year as Claude Code opened up a new wave of agentic coding for engineers and non-engineers alike. 
But compute constraints have caused friction even amongst its most die-hard fans—we’ve written previously about &lt;u&gt;&lt;a href="https://every.to/context-window/get-your-hands-dirty#signal" rel="noopener noreferrer" target="_blank"&gt;being frustrated&lt;/a&gt;&lt;/u&gt; with its OpenClaw restrictions and the speed of its latest models like &lt;u&gt;&lt;a href="https://every.to/vibe-check/opus-4-7" rel="noopener noreferrer" target="_blank"&gt;Opus 4.7&lt;/a&gt;&lt;/u&gt;.&lt;/p&gt;&lt;p&gt;The deal with SpaceX changes that equation. Anthropic has already doubled rate limits for subscription plans, removed peak-hour limits on Pro and Max accounts, and raised API rate limits by almost 17 times for certain tiers.&lt;/p&gt;&lt;p&gt;Other than that, the big story is Claude Managed Agents, Anthropic’s hosted agent product. The company released &lt;u&gt;&lt;a href="https://claude.com/blog/new-in-claude-managed-agents" rel="noopener noreferrer" target="_blank"&gt;three new features&lt;/a&gt;&lt;/u&gt;:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;strong&gt;Multi-agent orchestration:&lt;/strong&gt; a coordinator agent, baked into the platform, that spins up subagents in parallel&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Dreaming:&lt;/strong&gt; Anthropic’s general-purpose version of &lt;u&gt;&lt;a href="https://every.to/guides/compound-engineering" rel="noopener noreferrer" target="_blank"&gt;compound engineering&lt;/a&gt;&lt;/u&gt;, a feature that allows agents to learn from past sessions to improve between runs&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Outcomes:&lt;/strong&gt; Anthropic’s answer to Codex’s /goals command, allowing developers to specify an outcome and run an agent in a loop until the outcome is achieved&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;By themselves, these features are nice but not groundbreaking. What’s more important is that &lt;em&gt;what an AI platform is&lt;/em&gt; has changed. 
In the GPT-3 days, the platform was a text completion endpoint: Send text in, get text out. Now, with Claude Managed Agents, the platform is an AI model with a harness and host computer—all provided with unlimited scaling by the model companies.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;a href="https://cora.computer" rel="noopener noreferrer" target="_blank"&gt;Cora&lt;/a&gt; &lt;/strong&gt;general manager&lt;strong&gt; &lt;a href="https://every.to/@kieran_1355" rel="noopener noreferrer" target="_blank"&gt;Kieran Klaassen&lt;/a&gt;&lt;/strong&gt; and I reported live from the conference with our biggest takeaways, including the xAI compute deal, doubled Claude usage limits, Claude Managed Agents, and why the battle lines between OpenAI and Anthropic are starting to become clearer. Watch now:&lt;/p&gt;&lt;div class="quill-youtube" id="undefined" data-source="{&amp;quot;url&amp;quot;:&amp;quot;https://www.youtube.com/watch?v=4YNHb0XNV1A&amp;quot;,&amp;quot;height&amp;quot;:&amp;quot;400&amp;quot;,&amp;quot;youtube_id&amp;quot;:&amp;quot;4YNHb0XNV1A&amp;quot;}" data-height="400" data-youtube-id="4YNHb0XNV1A" style="max-height: 400px; overflow: hidden;"&gt;&lt;a href="https://www.youtube.com/watch?v=4YNHb0XNV1A" target="_blank" rel="noopener noreferrer"&gt;&lt;img src="https://img.youtube.com/vi/4YNHb0XNV1A/maxresdefault.jpg" style="width: 100%; aspect-ratio: 16 / 9; display: block;"&gt;&lt;div class="play"&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/static/emails/youtube-logo.png"&gt;&lt;/div&gt;&lt;/a&gt;&lt;/div&gt;&lt;p&gt;We also recorded a conversation with &lt;strong&gt;Angela Jiang&lt;/strong&gt;, head of product for the Claude platform, and &lt;strong&gt;Katelyn Lesse&lt;/strong&gt;, head of platform engineering. 
The full episode drops tomorrow on &lt;em&gt;&lt;a href="https://every.to/podcast" rel="noopener noreferrer" target="_blank"&gt;AI &amp;amp; I&lt;/a&gt;&lt;/em&gt;—highlights below.—&lt;em&gt;&lt;u&gt;&lt;a href="https://every.to/@danshipper" rel="noopener noreferrer" target="_blank"&gt;Dan Shipper&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/p&gt;&lt;h2&gt;&lt;hr class="quill-line"&gt;&lt;/h2&gt;&lt;h2&gt;Vibe Check: Claude Managed Agents &lt;/h2&gt;&lt;h4&gt;Spiral general manager Marcus Moretti uses the platform’s new features&lt;/h4&gt;&lt;p&gt;Anthropic launched Claude Managed Agents in April, and since then, Every’s AI writing tool &lt;strong&gt;&lt;u&gt;&lt;a href="https://writewithspiral.com/" rel="noopener noreferrer" target="_blank"&gt;Spiral&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; has used the platform to power its API and command line interface (CLI), which lets developers and other agents talk to Spiral outside the web app. Claude Managed Agents run on Anthropic’s servers, instead of us having to run them on our own.&lt;/p&gt;&lt;p&gt;We set up a new Managed Agent in an afternoon and &lt;u&gt;&lt;a href="https://every.to/context-window/the-missing-layer-in-ai-adoption#spiral-is-experimenting-with-agent-to-agent-workflows" rel="noopener noreferrer" target="_blank"&gt;deployed it to power our API&lt;/a&gt;&lt;/u&gt; the next day. We’ve incorporated two of the new features Anthropic announced yesterday (memory and multi-agent orchestration) and are deploying the third (outcomes) soon.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Memory:&lt;/strong&gt; Every’s editorial and social expertise—how to write a good X post, for example—lives in an Anthropic-hosted global memory store. The memory store lets us avoid including every piece of editorial and social expertise in the agent system prompt—the standing instructions that tell the agent what to do every time it runs. When a user asks for a podcast description, the agent doesn’t need to also recall how to craft a great LinkedIn post. 
It only pulls the relevant expertise with each request, thereby making responses faster. &lt;/p&gt;&lt;p&gt;Each Spiral subscriber also gets their own personal memory store. When you tell Spiral that you prefer em-dashes over semicolons or that your company name is one word and not two, it will remember and apply your rules by default the next time you run it.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Multi-agent orchestration:&lt;/strong&gt; When users request a single draft of a piece of writing, one agent using &lt;u&gt;&lt;a href="https://every.to/vibe-check/opus-4-6" rel="noopener noreferrer" target="_blank"&gt;Opus 4.6&lt;/a&gt;&lt;/u&gt; Fast handles the workflow end-to-end. For multi-draft requests, a coordinator agent using &lt;u&gt;&lt;a href="https://every.to/vibe-check/vibe-check-claude-haiku-4-5-anthropic-cooked" rel="noopener noreferrer" target="_blank"&gt;Haiku 4.5&lt;/a&gt;&lt;/u&gt; spins up multiple Opus 4.6 Fast subagents to compose drafts in parallel. Before multi-agent orchestration, multi-draft requests were handled serially, and each draft added 20 to 30 seconds to the overall request time. A multi-agent approach also reduced our costs for multi-draft requests by about a third because we were able to use cheaper models for part of the work.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Outcomes:&lt;/strong&gt; Anthropic’s new outcomes capability is a feedback loop where one “grader” AI checks another AI’s work against a specified goal. Spiral’s main value proposition is writing quality, so we’re using outcomes to set up a rubric to ensure the writer agent’s output meets Spiral’s editorial standards and matches the user’s style guide. The rubric the grader AI uses is generated on-the-fly based on the global standards, the user’s writing style, and their writing preferences from memory.&lt;/p&gt;&lt;p&gt;Memory and multi-agent orchestration are live in production, and outcomes is coming soon. 
You can see the features in action by running npm i -g @every-env/spiral-cli &amp;amp;&amp;amp; spiral login or logging into Spiral and using the install command on the &lt;u&gt;&lt;a href="https://app.writewithspiral.com/settings/api-keys" rel="noopener noreferrer" target="_blank"&gt;Agent and API keys page&lt;/a&gt;&lt;/u&gt;.&lt;/p&gt;&lt;p&gt;Having set these features up in production, here’s what I think: &lt;/p&gt;&lt;p&gt;&lt;strong&gt;You are not totally locked into Anthropic’s universe.&lt;/strong&gt; Every engineer worries that when a company offers a hosted version of something, it will be hard to leave. With Managed Agents, the agents themselves, sessions, and memory are all stored on Anthropic machines, and the agents themselves can only be powered by Claude—a managed agent can’t run on GPT-5.5 or Gemini. &lt;/p&gt;&lt;p&gt;I’ve mitigated this lock-in in two ways: First, we save agent runs to our own database in addition to Anthropic’s. This way, chats from the API appear in the web app just as web chats do, but it doubles as a safety net. If we ever wanted to leave Anthropic, we’d have all our historical data. Second, the Managed Agents platform lets you define custom tools for the agents. Those tools run on our servers, which means we can use whatever model we want inside the tools themselves. The coordinator agent is locked to Claude, but we control the layer underneath. &lt;/p&gt;&lt;p&gt;&lt;strong&gt;Using multiple agents has trade-offs.&lt;/strong&gt; Multi-agent orchestration has allowed us to create multiple drafts faster and cheaper. However, coordination between agents adds overhead that prevents greater speed gains. Debugging also gets harder: If a Spiral draft comes back subpar, we have to investigate both the coordinator agent and the writer agent to identify the root cause. I’d recommend multi-agent orchestration only when your agent benefits from running subagents in parallel or using a mixture of models. 
Otherwise, a single agent works well.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Memory’s design is intuitive. &lt;/strong&gt;Each memory is just a folder of markdown files, and each memory store is attached to a session with instructions that tell the agent when to consult it. Anthropic designed this feature thoughtfully—they kept it simple.—&lt;em&gt;&lt;u&gt;&lt;a href="https://every.to/@marcus_fd8302_1" rel="noopener noreferrer" target="_blank"&gt;Marcus Moretti&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/p&gt;&lt;h2&gt;&lt;hr class="quill-line"&gt;&lt;/h2&gt;&lt;h2&gt;The feature to watch: Dreaming &lt;/h2&gt;&lt;h4&gt;Cora general manager Kieran Klaassen sees his own philosophy mirrored back at him&lt;/h4&gt;&lt;p&gt;Kieran has spent the last year trying to get agents to learn his preferences instead of forcing him to restate them every time. That’s &lt;u&gt;&lt;a href="https://every.to/guides/compound-engineering" rel="noopener noreferrer" target="_blank"&gt;compound engineering&lt;/a&gt;&lt;/u&gt; in a nutshell—each run leaves the system better prepared for the next one. So when Anthropic officially announced dreaming at yesterday’s Code with Claude event, he had a &lt;u&gt;&lt;a href="https://every.to/context-window/if-saas-is-dead-linear-didn-t-get-the-memo" rel="noopener noreferrer" target="_blank"&gt;familiar feeling&lt;/a&gt;&lt;/u&gt;: The thing he’d been building was now a feature.&lt;/p&gt;&lt;p&gt;Dreaming is Anthropic’s name for a background process that &lt;u&gt;&lt;a href="https://claude.com/blog/new-in-claude-managed-agents" rel="noopener noreferrer" target="_blank"&gt;reviews an agent’s past sessions and memory stores&lt;/a&gt;&lt;/u&gt;, finds patterns, and rewrites memory so the agent improves between runs. 
OpenClaw &lt;u&gt;&lt;a href="https://every.to/context-window/every-is-half-agent-now#do-agents-dream-of-electric-sheep" rel="noopener noreferrer" target="_blank"&gt;introduced&lt;/a&gt;&lt;/u&gt; a similar feature in April, but Anthropic’s take seems more focused on what teams of agents learn collectively than what a single agent remembers. The system learns from repeated corrections, recurring mistakes, and workflows that run well—creating, over time, an institutional knowledge base.&lt;/p&gt;&lt;p&gt;The feature currently lives inside Claude Managed Agents as a &lt;u&gt;&lt;a href="https://claude.com/blog/new-in-claude-managed-agents" rel="noopener noreferrer" target="_blank"&gt;research preview&lt;/a&gt;&lt;/u&gt;, which is where Marcus has been testing it—with early success. Every plans to have its production agents dream as soon as the feature ships in a stable public release. But Kieran’s immediate question was: When is this coming to Claude Code? &lt;/p&gt;&lt;p&gt;Claude Code, after all, is where developers spend their days teaching agents the same repo quirks, the same testing rituals, the same “please don’t do it that way” preferences. Those preferences can go into memory files, but memory files get messy. They collect duplicates, stale rules, one-off notes, and contradictions—and as Marcus notes, memory introduces overhead, so you trade speed for quality every time you use it.&lt;/p&gt;&lt;p&gt;A dream cleans that up. It takes &lt;u&gt;&lt;a href="https://platform.claude.com/docs/en/managed-agents/dreams" rel="noopener noreferrer" target="_blank"&gt;up to 100 past sessions&lt;/a&gt;&lt;/u&gt; and produces a reorganized memory store with duplicates merged, contradicted entries replaced, and new insights pulled out—memory that organizes itself, in Marcus’s framing. 
If Anthropic brings that loop to Claude Code, memory starts to look less like a notes folder and more like accumulated taste.—&lt;em&gt;&lt;u&gt;&lt;a href="https://every.to/@katie.parrott12" rel="noopener noreferrer" target="_blank"&gt;Katie Parrott&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;Inside Anthropic &lt;/h2&gt;&lt;h3&gt;What the company’s platform team told us off-stage&lt;/h3&gt;&lt;p&gt;While at the conference, Dan sat down with &lt;strong&gt;Angela Jiang&lt;/strong&gt;, Anthropic’s head of product for the Claude platform, and &lt;strong&gt;Katelyn Lesse&lt;/strong&gt;, head of platform engineering, for a recorded conversation. Three things stood out: &lt;/p&gt;&lt;p&gt;&lt;strong&gt;The generic harness is dead.&lt;/strong&gt; Angela told us that building a generalized harness that lets you switch any underlying model for a different one—standard practice even a few months ago—is a losing strategy. Different harnesses paired with the same model produce “drastically different” results on Anthropic’s own evaluations. When the team built memory for Managed Agents, they tested multiple harness designs, and the performance gaps were large enough to make model selection feel secondary. &lt;/p&gt;&lt;p&gt;Our own experience backs this up: Our agents run on Claude with a harness tuned specifically for how Claude works. If we don’t want to risk getting locked in, we have to—as Marcus writes above—build the harness in a way that lets us swap in GPT or Gemini. But Angela’s argument is that the bigger risk is leaving performance on the table.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Infrastructure is the real wall.&lt;/strong&gt; Katelyn told us that most people building agents expect the hard part to be the prompting, context window management, and tool setup required to get the most out of the model. In practice, everyone hits the same wall: infrastructure. 
They have to keep servers running, maintain secure sandboxes, prevent connection drops, and store transcripts. Before Marcus set up Managed Agents in an afternoon and deployed it the next day, we spent months on exactly that kind of plumbing.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Your agent needs a babysitter.&lt;/strong&gt; Dan raised this problem directly: Agents get stale fast, running old models and old prompts with nobody responsible for updating them. Our solution so far has been to assign every agent an owner to keep an eye on it. Katelyn said the Anthropic team has built skills to help agents upgrade themselves to new models. “The most &lt;u&gt;&lt;a href="https://every.to/chain-of-thought/toward-a-definition-of-agi" rel="noopener noreferrer" target="_blank"&gt;AGI&lt;/a&gt;&lt;/u&gt;-pilled people,” she added, “are running agents that monitor their agents.” &lt;/p&gt;&lt;p&gt;The full episode with Angela and Katelyn drops tomorrow on &lt;em&gt;&lt;a href="https://every.to/podcast" rel="noopener noreferrer" target="_blank"&gt;AI &amp;amp; I&lt;/a&gt;&lt;/em&gt;—we go deeper on where the platform is headed, what “outcome + budget” means as a design philosophy, and why Anthropic thinks Claude should eventually pick its own sub-agents.—&lt;em&gt;KP&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;em&gt;&lt;a href="https://every.to/@katie.parrott12" rel="noopener noreferrer" target="_blank"&gt;Katie Parrott&lt;/a&gt;&lt;/em&gt;&lt;/strong&gt; &lt;em&gt;is a staff writer at Every. You can read more of her work in&lt;/em&gt; &lt;em&gt;&lt;a href="https://katieparrott.substack.com/" rel="noopener noreferrer" target="_blank"&gt;her newsletter&lt;/a&gt;. 
To read more essays like this, subscribe to &lt;u&gt;&lt;a href="https://every.to/subscribe" rel="noopener noreferrer" target="_blank"&gt;Every&lt;/a&gt;&lt;/u&gt;, and follow us on X at &lt;u&gt;&lt;a href="http://twitter.com/every" rel="noopener noreferrer" target="_blank"&gt;@every&lt;/a&gt;&lt;/u&gt; and on &lt;u&gt;&lt;a href="https://www.linkedin.com/company/everyinc/" rel="noopener noreferrer" target="_blank"&gt;LinkedIn&lt;/a&gt;&lt;/u&gt;.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;We &lt;u&gt;&lt;a href="https://every.to/studio" rel="noopener noreferrer" target="_blank"&gt;build AI tools&lt;/a&gt;&lt;/u&gt; for readers like you. Write brilliantly with &lt;u&gt;&lt;a href="https://writewithspiral.com/" rel="noopener noreferrer" target="_blank"&gt;Spiral&lt;/a&gt;&lt;/u&gt;. Organize files automatically with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://makeitsparkle.co/?utm_source=everyfooter" rel="noopener noreferrer" target="_blank"&gt;Sparkle&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. Deliver yourself from email with &lt;u&gt;&lt;a href="https://cora.computer/" rel="noopener noreferrer" target="_blank"&gt;Cora&lt;/a&gt;&lt;/u&gt;. Dictate effortlessly with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://monologue.to/" rel="noopener noreferrer" target="_blank"&gt;Monologue&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. Collaborate with agents on documents with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;a href="https://www.proofeditor.ai/" rel="noopener noreferrer" target="_blank"&gt;Proof&lt;/a&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. &lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;We also do AI training, adoption, and innovation for companies. 
&lt;u&gt;&lt;a href="https://every.to/consulting?utm_source=emailfooter" rel="noopener noreferrer" target="_blank"&gt;Work with us&lt;/a&gt;&lt;/u&gt; to bring AI into your organization.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;For sponsorship opportunities, reach out to sponsorships@every.to.&lt;/em&gt;&lt;/p&gt;&lt;div class="quill-button" data-source="{&amp;quot;id&amp;quot;:&amp;quot;quill-button-1769187301610&amp;quot;,&amp;quot;url&amp;quot;:&amp;quot;https://every.to/subscribe?source=post_button&amp;quot;,&amp;quot;text&amp;quot;:&amp;quot;Subscribe&amp;quot;}" id="quill-button-1769187301610"&gt;&lt;a href="https://every.to/subscribe?source=post_button"&gt;Subscribe&lt;/a&gt;&lt;/div&gt;</description>
      <author>Dan Shipper, Marcus Moretti, and Katie Parrott / Chain of Thought</author>
      <pubDate>2026-05-07 12:00:00 -0400</pubDate>
      <guid>https://every.to/chain-of-thought/inside-anthropic-s-2026-developer-conference</guid>
      <link>https://every.to/chain-of-thought/inside-anthropic-s-2026-developer-conference</link>
    </item>
    <item>
      <title>OpenAI Flips the Script </title>
      <description>&lt;table&gt;&lt;tr&gt;&lt;td&gt;&lt;img alt="Context Window" src="https://d24ovhgu8s7341.cloudfront.net/uploads/publication/logo/94/small_context_windown_1.png" /&gt;&lt;/td&gt;&lt;td&gt;&lt;/td&gt;&lt;td&gt;&lt;table&gt;&lt;tr&gt;&lt;td&gt;by &lt;a href="https://every.to/@laura_27bbaf_1" itemprop="name"&gt;Laura Entis&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;in &lt;a href="https://every.to/context-window"&gt;Context Window&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;figure&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/post/cover/4156/full_page_cover_e683df76415d802f-OpenAI_flips_the_script_1.png"&gt;&lt;figcaption&gt;Midjourney/Every illustration.&lt;/figcaption&gt;&lt;/figure&gt;&lt;p&gt;There’s no resting on your laurels in the AI race: OpenAI’s Codex went from trailing Anthropic’s Claude Code to pulling ahead in functionality, at least for now, in a matter of months. Today, Every CEO &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@danshipper" rel="noopener noreferrer" target="_blank"&gt;Dan Shipper&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; explains why OpenAI’s coding app has become his daily driver for work, head of growth &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@tedescau" rel="noopener noreferrer" target="_blank"&gt;Austin Tedesco&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; shares his no-nonsense advice for switching over from Claude Code, and &lt;strong&gt;&lt;u&gt;&lt;a href="https://writewithspiral.com/" rel="noopener noreferrer" target="_blank"&gt;Spiral&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; general manager &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@marcus_fd8302_1" rel="noopener noreferrer" target="_blank"&gt;Marcus Moretti&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; argues it’s OK—good, even—to let some AI trends pass you by. &lt;/p&gt;&lt;p&gt;&lt;em&gt;Was this newsletter forwarded to you? 
&lt;u&gt;&lt;a href="https://every.to/account" rel="noopener noreferrer" target="_blank"&gt;Sign up&lt;/a&gt;&lt;/u&gt; to get it in your inbox.&lt;/em&gt;&lt;/p&gt;&lt;div class="quill-button" data-source="{&amp;quot;id&amp;quot;:&amp;quot;quill-button-1769530239147&amp;quot;,&amp;quot;url&amp;quot;:&amp;quot;https://every.to/subscribe?source=post_button&amp;quot;,&amp;quot;text&amp;quot;:&amp;quot;Subscribe&amp;quot;}" id="quill-button-1769530239147"&gt;&lt;a href="https://every.to/subscribe?source=post_button"&gt;Subscribe&lt;/a&gt;&lt;/div&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;‘AI &amp;amp; I’: Why we switched from Claude Code to Codex &lt;/h2&gt;&lt;h4&gt;Codex takes the lead&lt;/h4&gt;&lt;p&gt;If you’re looking for evidence of AI’s unrelenting pace, here it is: In January, Dan &lt;u&gt;&lt;a href="https://every.to/chain-of-thought/openai-has-some-catching-up-to-do" rel="noopener noreferrer" target="_blank"&gt;wrote&lt;/a&gt;&lt;/u&gt; that whoever wins vibe coding wins how you work on your computer—and that OpenAI had some serious catching up to do.&lt;/p&gt;&lt;p&gt;Three months and the release of OpenAI’s &lt;u&gt;&lt;a href="https://every.to/vibe-check/gpt-5-5" rel="noopener noreferrer" target="_blank"&gt;latest model&lt;/a&gt;&lt;/u&gt; later, Codex is there, and in a new episode of&lt;em&gt; &lt;u&gt;&lt;a href="https://every.to/podcast" rel="noopener noreferrer" target="_blank"&gt;AI &amp;amp; I&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;, Dan and Austin get into why they do much of their knowledge work in Codex now. They cite the power of &lt;u&gt;&lt;a href="https://every.to/vibe-check/gpt-5-5" rel="noopener noreferrer" target="_blank"&gt;GPT-5.5&lt;/a&gt;&lt;/u&gt;, paired with a desktop app that is faster and more powerful than Claude Desktop or Cowork. 
&lt;/p&gt;&lt;p&gt;Watch on &lt;a href="https://x.com/danshipper/status/2052054077656252512" rel="noopener noreferrer" target="_blank"&gt;X&lt;/a&gt; or &lt;a href="https://youtu.be/x9BNBcP_C7Q" rel="noopener noreferrer" target="_blank"&gt;YouTube&lt;/a&gt;, or listen on &lt;a href="https://open.spotify.com/episode/2HuoYt9ZV6CzY6foHL1vJe?si=98cb3DpLR266jg06bR2SXg" rel="noopener noreferrer" target="_blank"&gt;Spotify&lt;/a&gt; or &lt;a href="https://podcasts.apple.com/us/podcast/why-we-switched-from-claude-code-to-codex/id1719789201?i=1000766460229" rel="noopener noreferrer" target="_blank"&gt;Apple Podcasts&lt;/a&gt;. You can also read &lt;a href="https://every.to/podcast/transcript-why-we-switched-from-claude-code-to-codex" rel="noopener noreferrer" target="_blank"&gt;the transcript&lt;/a&gt;.&lt;/p&gt;&lt;div class="quill-youtube" id="undefined" data-source="{&amp;quot;url&amp;quot;:&amp;quot;https://youtu.be/x9BNBcP_C7Q&amp;quot;,&amp;quot;height&amp;quot;:&amp;quot;400&amp;quot;,&amp;quot;youtube_id&amp;quot;:&amp;quot;x9BNBcP_C7Q&amp;quot;}" data-height="400" data-youtube-id="x9BNBcP_C7Q" style="max-height: 400px; overflow: hidden;"&gt;&lt;a href="https://youtu.be/x9BNBcP_C7Q" target="_blank" rel="noopener noreferrer"&gt;&lt;img src="https://img.youtube.com/vi/x9BNBcP_C7Q/maxresdefault.jpg" style="width: 100%; aspect-ratio: 16 / 9; display: block;"&gt;&lt;div class="play"&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/static/emails/youtube-logo.png"&gt;&lt;/div&gt;&lt;/a&gt;&lt;/div&gt;&lt;p&gt;Here are a couple of Dan and Austin’s favorite current use cases for Codex: &lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;strong&gt;Austin uses Codex for strategy docs.&lt;/strong&gt; Austin needed to write a go-to-market plan for a new Every product but kept getting pulled away by other work. 
So he pointed Codex at the team’s Notion meeting notes, Slack threads, and his preferred template and told it to pull together content where they’d discussed strategy and transform it into an action plan. What came back was 80 to 90 percent of the way there.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Dan uses Codex for recruiting.&lt;/strong&gt; When he is &lt;u&gt;&lt;a href="https://every.to/careers#open-roles" rel="noopener noreferrer" target="_blank"&gt;recruiting&lt;/a&gt;&lt;/u&gt; people to work at Every, Dan starts with a sense of where strong candidates might have learned the skills Every needs, instead of looking for a specific job title. He then asks Codex to find people who match that career arc—for example, to find someone to &lt;u&gt;&lt;a href="https://modern-ton-234.notion.site/1ffca4f355ac8361a0948106d4dc1bed?pvs=105" rel="noopener noreferrer" target="_blank"&gt;help scale Every’s courses&lt;/a&gt;&lt;/u&gt;, he looked for candidates who had worked at education startup General Assembly before transitioning into AI. &lt;/li&gt;&lt;/ul&gt;&lt;p&gt;Miss an episode? 
Catch up on Dan’s recent conversations with LinkedIn cofounder &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/" rel="noopener noreferrer" target="_blank"&gt;Reid Hoffman&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;; the team that built Claude Code, &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/" rel="noopener noreferrer" target="_blank"&gt;Cat Wu&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; and &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/" rel="noopener noreferrer" target="_blank"&gt;Boris Cherny&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;; Vercel cofounder &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/" rel="noopener noreferrer" target="_blank"&gt;Guillermo Rauch&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;; podcaster &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/" rel="noopener noreferrer" target="_blank"&gt;Dwarkesh Patel&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;; and others, and learn how they use AI to think, create, and relate.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;Migration anxiety&lt;/h2&gt;&lt;h4&gt;Claude Code-to-Codex &lt;/h4&gt;&lt;p&gt;If you want to switch to Codex or any other coding app, how should you think about migrating? When your setup includes app-specific &lt;u&gt;&lt;a href="https://every.to/p/the-agent-that-saved-my-brain" rel="noopener noreferrer" target="_blank"&gt;project folders&lt;/a&gt;&lt;/u&gt;, skills, plugins, or integrations, it can be daunting.  &lt;/p&gt;&lt;p&gt;Austin’s migration from Claude Code to Codex was disarmingly simple: He opened his Every work project in Codex, told it he typically worked in Claude Code, asked it to inspect the folder, and told it to update anything that should work differently in Codex.&lt;/p&gt;&lt;p&gt;When Codex got something wrong, he handled it in the moment and told it, “This doesn’t look great. Can you fix it?” And it did. 
&lt;/p&gt;&lt;p&gt;Before GPT-5.5, staff writer &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@katie.parrott12" rel="noopener noreferrer" target="_blank"&gt;Katie Parrott&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; hadn’t used ChatGPT for writing in almost a year.&lt;/p&gt;&lt;p&gt;Now, she splits her writing sessions between Claude Code and Codex. She moved over by giving Codex the writing and editing skills she had already saved as Markdown files on her computer and asking it to adapt them for its own environment.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;Steal this workflow&lt;/h2&gt;&lt;h4&gt;Join the early majority&lt;/h4&gt;&lt;p&gt;&lt;u&gt;&lt;a href="https://writewithspiral.com/" rel="noopener noreferrer" target="_blank"&gt;Spiral&lt;/a&gt;&lt;/u&gt; general manager Marcus is OK with letting most AI hype—managing a swarm of OpenClaws each running on its own Mac Mini, for example—pass him by. Earlier in his career, he was an early adopter of new tools and technology trends, but these days, he finds himself closer to the early majority section of the adoption curve. 
As the one-man team behind Every’s AI writing product, &lt;u&gt;&lt;a href="https://every.to/source-code/claude-code-for-product-managers" rel="noopener noreferrer" target="_blank"&gt;he has a lot to do&lt;/a&gt;&lt;/u&gt;—if he’s going to add something new to his workflow, it has to clear a high bar.&lt;/p&gt;&lt;div class="quill-block-image" id="quill-block-image-1778079872018-bv7vai2n0" data-source="{&amp;quot;dom_id&amp;quot;:&amp;quot;quill-block-image-1778079872018-bv7vai2n0&amp;quot;,&amp;quot;link&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4156/optimized_5c354b57-f6e3-4679-a2c2-8afacec48075.png&amp;quot;,&amp;quot;image&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4156/optimized_5c354b57-f6e3-4679-a2c2-8afacec48075.png&amp;quot;,&amp;quot;caption&amp;quot;:&amp;quot;Marcus is comfortable being among the 34 percent of the population who are slightly early to adopting a new technology. (Image, which is based on Everett Rogers’ Diffusion of Innovations framework, courtesy of Laura Entis.)&amp;quot;,&amp;quot;error&amp;quot;:null}"&gt;&lt;div&gt;&lt;a href="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4156/optimized_5c354b57-f6e3-4679-a2c2-8afacec48075.png" target="_blank" rel="noopener noreferrer"&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4156/optimized_5c354b57-f6e3-4679-a2c2-8afacec48075.png" alt="Marcus is comfortable being among the 34 percent of the population who are slightly early to adopting a new technology. (Image, which is based on Everett Rogers’ Diffusion of Innovations framework, courtesy of Laura Entis.)"&gt;&lt;/a&gt;&lt;figcaption class="quill-image-caption"&gt;Marcus is comfortable being among the 34 percent of the population who are slightly early to adopting a new technology. 
(Image, which is based on Everett Rogers’ Diffusion of Innovations framework, courtesy of Laura Entis.)&lt;/figcaption&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;&lt;br&gt;&lt;/p&gt;&lt;p&gt;Here’s Marcus’s strategy for determining what’s worth testing.&lt;/p&gt;&lt;ol&gt;&lt;li&gt;&lt;strong&gt;Start with a real problem.&lt;/strong&gt; A useful filter is to focus only on tools or services that solve an existing issue. For example, Marcus decided to test out Stripe’s token-based billing feature—which allows you to measure how much users cost you in tokens—because of a genuine challenge he was facing: Spiral needed a better way to track AI usage costs across models.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Don’t fall for productivity theater.&lt;/strong&gt; Marcus ignores demos that brag about how many machines or agents someone is running simultaneously. He doesn’t care about what the setup looks like; what matters is whether it will make his life better.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Sit back and see what pans out.&lt;/strong&gt; Marcus generally waits to try a product until there’s evidence that companies he respects are using it in production, even if that evidence is just the customer logos on a tool’s homepage. Even better if the product is from a company he already knows and trusts, like Stripe or Anthropic. With the Stripe token-based billing example, the calculus was simple: “Great company solving a real problem I have—I’ll try it,” he says.&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;&lt;strong&gt;Test it out for yourself:&lt;/strong&gt; &lt;/p&gt;&lt;p&gt;Pick one AI tool you feel vaguely guilty for not trying and write one sentence: “Before this tool, I _____. 
After this tool, I can _____.” If you cannot fill in both blanks, let yourself off the hook.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;Alignment&lt;/h2&gt;&lt;h4&gt;Every’s COO Brandon Gell on cultivating curiosity in an AI world&lt;/h4&gt;&lt;p&gt;My son was born eight months ago. Since then, I’ve asked myself regularly: How can I teach him to lead a fulfilling life, especially when it comes to technology?&lt;/p&gt;&lt;p&gt;I’m a computer native, born in 1994, the year Netscape was first released. My son was born in 2025, the year Claude Code was invented. The world I grew up in rewarded people with the fortitude to find answers. The world he’s growing up in has made that table stakes. So if the answers aren’t scarce anymore, what is?&lt;/p&gt;&lt;p&gt;Curiosity. Knowing what to ask next—having the instinct to push further, to connect unexpected dots, to wonder about something nobody else paid attention to—is what’s scarce.&lt;/p&gt;&lt;p&gt;It’s also distinctly human. It causes us to make connections between unrelated ideas and connect dots that don’t follow obvious patterns. It brings our personal values and lived experiences into what we explore, shaping not only what we discover but why it matters. It pulls us toward questions we find fascinating—not because they’re useful, but because we can’t stop wondering.&lt;/p&gt;&lt;p&gt;AI can’t replicate that. Curiosity requires perspective and &lt;u&gt;&lt;a href="https://every.to/p/what-is-taste-really" rel="noopener noreferrer" target="_blank"&gt;taste&lt;/a&gt;&lt;/u&gt;, things that are difficult to instill in a model. And even if you could, it would never be as diverse as the perspectives of 8 billion humans, each one shaped by a different life.&lt;/p&gt;&lt;p&gt;I want my son to be insatiably curious, and I’ve realized that to instill that in him, I need to cultivate it in myself. Which means developing it and maintaining it, like a muscle. 
Here’s what that looks like:&lt;/p&gt;&lt;h5&gt;&lt;strong&gt;Lesson 1: Use AI to go deeper on something you already care about&lt;/strong&gt;&lt;/h5&gt;&lt;p&gt;After I sold my &lt;u&gt;&lt;a href="https://every.to/on-every/brandon-gell-joins-every-as-our-first-entrepreneur-in-residence" rel="noopener noreferrer" target="_blank"&gt;insurance company, Clyde&lt;/a&gt;&lt;/u&gt;, I realized how disconnected I had become from my creativity outside of work. The same curiosity that drove me to explore the idea that had become my company had gone dormant as I focused singularly on its success. I realized just how lost I was while driving and listening to music. I could hear the music, but I could no longer &lt;em&gt;feel&lt;/em&gt; it. &lt;/p&gt;&lt;p&gt;Not long after this drive, my friend Mike showed me some speakers he had built. I realized in order to truly hear the music, to find my curiosity, I had to build a pair of speakers and a subwoofer. The project would combine my interest in architecture, experience with woodworking, and total lack of knowledge in audio engineering. &lt;/p&gt;&lt;p&gt;Next thing I knew, I was hours deep into a ChatGPT conversation about sound waves and acoustic design, learning how. &lt;/p&gt;&lt;h5&gt;&lt;strong&gt;Lesson 2: Use AI to build something you wouldn’t otherwise make&lt;/strong&gt;&lt;/h5&gt;&lt;p&gt;For the past 15 years, I’ve on and off tried lucid dreaming. So when I saw the &lt;u&gt;&lt;a href="https://github.com/modem-works/dream-recorder" rel="noopener noreferrer" target="_blank"&gt;Dream Recorder GitHub repository&lt;/a&gt;&lt;/u&gt;, an open-source project that uses video AI models to visualize your dreams &lt;u&gt;&lt;a href="https://modemworks.com/projects/dreamrecorder/" rel="noopener noreferrer" target="_blank"&gt;as cinematic reels on a bedside device&lt;/a&gt;&lt;/u&gt;, I knew I wanted to make one for myself. The problem? 
I’d never built any hardware, didn’t have a 3D printer, and calling myself a front-end developer would be generous. So I used AI to help me adapt the open-source repository and build something I’d never otherwise be able to make. I bought a 3D printer, improved the original code, and spent many long nights perfecting my dream recorder.&lt;/p&gt;&lt;p&gt;I still don’t know how to code. But that doesn’t matter. In both situations, I used AI to leapfrog the unknown and explore my curiosity and my dreams. AI was a learning partner, not an answering machine. It taught me the things I didn’t know, and I combined that with the skills I already had to build something new.&lt;/p&gt;&lt;h5&gt;&lt;strong&gt;What this means for all of us&lt;/strong&gt;&lt;/h5&gt;&lt;p&gt;In a world where the “right” answer is one AI prompt away, we need to stop rewarding our kids and our students for getting the answer right and start rewarding them for the quality of their questions, the depth of their curiosity, and the resilience to keep asking the next question when they’re in uncharted territory. Curiosity is what separates the people who use AI as a crutch from the people who use it as a rocket. &lt;/p&gt;&lt;p&gt;In a world where there’s always an answer, let the next question be your guide.—&lt;em&gt;&lt;u&gt;&lt;a href="https://every.to/@brandon_5263" rel="noopener noreferrer" target="_blank"&gt;Brandon Gell&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;strong&gt;&lt;em&gt;&lt;a href="https://every.to/@laura_27bbaf_1" rel="noopener noreferrer" target="_blank"&gt;Laura Entis&lt;/a&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt; is a staff writer at Every. You can follow her on &lt;a href="https://www.linkedin.com/in/lauraentis/" rel="noopener noreferrer" target="_blank"&gt;LinkedIn&lt;/a&gt;. 
&lt;/em&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;To read more essays like this, subscribe to &lt;u&gt;&lt;a href="https://every.to/subscribe" rel="noopener noreferrer" target="_blank"&gt;Every&lt;/a&gt;&lt;/u&gt;, and follow us on X at &lt;u&gt;&lt;a href="http://twitter.com/every" rel="noopener noreferrer" target="_blank"&gt;@every&lt;/a&gt;&lt;/u&gt; and on &lt;u&gt;&lt;a href="https://www.linkedin.com/company/everyinc/" rel="noopener noreferrer" target="_blank"&gt;LinkedIn&lt;/a&gt;&lt;/u&gt;. For sponsorship opportunities, reach out to sponsorships@every.to.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;Help us scale the only subscription you need to stay at the edge of AI. Explore &lt;u&gt;&lt;a href="https://www.notion.so/Jobs-Every-25cca4f355ac80c5ad6ee7a6e93d6b4e?pvs=21" rel="noopener noreferrer" target="_blank"&gt;open roles at Every&lt;/a&gt;&lt;/u&gt;.&lt;/em&gt;&lt;/p&gt;&lt;div class="quill-button" data-source="{&amp;quot;id&amp;quot;:&amp;quot;quill-button-1769187301610&amp;quot;,&amp;quot;url&amp;quot;:&amp;quot;https://every.to/subscribe?source=post_button&amp;quot;,&amp;quot;text&amp;quot;:&amp;quot;Subscribe&amp;quot;}" id="quill-button-1769187301610"&gt;&lt;a href="https://every.to/subscribe?source=post_button"&gt;Subscribe&lt;/a&gt;&lt;/div&gt;</description>
      <author>Laura Entis / Context Window</author>
      <pubDate>2026-05-06 08:00:00 -0400</pubDate>
      <guid>https://every.to/context-window/openai-flips-the-script</guid>
      <link>https://every.to/context-window/openai-flips-the-script</link>
    </item>
    <item>
      <title>The Dawn of Codex-native Apps</title>
      <description>&lt;table&gt;&lt;tr&gt;&lt;td&gt;&lt;img alt="Context Window" src="https://d24ovhgu8s7341.cloudfront.net/uploads/publication/logo/94/small_context_windown_1.png" /&gt;&lt;/td&gt;&lt;td&gt;&lt;/td&gt;&lt;td&gt;&lt;table&gt;&lt;tr&gt;&lt;td&gt;by &lt;a href="https://every.to/@katie.parrott12" itemprop="name"&gt;Katie Parrott&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;in &lt;a href="https://every.to/context-window"&gt;Context Window&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;figure&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/post/cover/4154/full_page_cover_6946cfab923a7c5d-CW_Image.png"&gt;&lt;figcaption&gt;Midjourney/Every illustration.&lt;/figcaption&gt;&lt;/figure&gt;&lt;h2&gt;Inside Every&lt;/h2&gt;&lt;p&gt;Working with AI right now often means making the same judgment call dozens of times a day: Hand this task off to an agent or stay close to the process? “The landscape of working with AI is bifurcating,” is how CEO &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@danshipper" rel="noopener noreferrer" target="_blank"&gt;Dan Shipper&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; put it in Every’s Monday standup. On one side is the agent you delegate to. On the other is the agent that sits beside you while you write, code, triage, revise, and decide.&lt;/p&gt;&lt;p&gt;Watching the Every team work, you can’t unsee it. Dan delegates bug reports for our collaborative document editor, &lt;strong&gt;&lt;u&gt;&lt;a href="http://proofeditor.ai" rel="noopener noreferrer" target="_blank"&gt;Proof&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;, to his OpenClaw agent, R2-C2&lt;strong&gt;. 
&lt;/strong&gt;But he stays close to his inbox through a combination of &lt;u&gt;&lt;a href="https://every.to/context-window/one-app-to-rule-all-knowledge-work" rel="noopener noreferrer" target="_blank"&gt;Codex&lt;/a&gt;&lt;/u&gt;&lt;a href="https://every.to/@kieran_1355" rel="noopener noreferrer" target="_blank"&gt;,&lt;/a&gt; Every’s AI email assistant &lt;strong&gt;&lt;u&gt;&lt;a href="http://cora.computer" rel="noopener noreferrer" target="_blank"&gt;Cora&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;&lt;a href="https://every.to/@kieran_1355" rel="noopener noreferrer" target="_blank"&gt;, and a document with custom rules (steal his workflow below&lt;/a&gt;). &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@kieran_1355" rel="noopener noreferrer" target="_blank"&gt;Kieran Klaassen&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; hands the middle of his &lt;u&gt;&lt;a href="https://every.to/guides/compound-engineering" rel="noopener noreferrer" target="_blank"&gt;compound engineering&lt;/a&gt;&lt;/u&gt; workflow to the model but works closely with it to brainstorm at the beginning and polish at the end. I (&lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@katie.parrott12" rel="noopener noreferrer" target="_blank"&gt;Katie Parrott&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;) send the model off to do research, but I’d never trust it to execute a full draft without my hands firmly on the wheel.&lt;/p&gt;&lt;p&gt;Which means the &lt;u&gt;&lt;a href="https://every.to/chain-of-thought/the-knowledge-economy-is-over-welcome-to-the-allocation-economy" rel="noopener noreferrer" target="_blank"&gt;allocation economy&lt;/a&gt;&lt;/u&gt; thesis was only right about half the work. Some of it still wants delegation, but the other half wants you to stay close, pairing on every move with the model in the same window. 
The two halves demand different skills, and the meta-skill is knowing which is which.&lt;/p&gt;&lt;p&gt;Think of it as the AI version of the &lt;u&gt;&lt;a href="https://en.wikipedia.org/wiki/Serenity_Prayer" rel="noopener noreferrer" target="_blank"&gt;serenity prayer&lt;/a&gt;&lt;/u&gt;: Grant me the serenity to delegate the work I can, the expertise to sit with the model on the work I can’t, and the wisdom to know the difference.&lt;/p&gt;&lt;p&gt;&lt;em&gt;Was this newsletter forwarded to you? &lt;u&gt;&lt;a href="https://every.to/account" rel="noopener noreferrer" target="_blank"&gt;Sign up&lt;/a&gt;&lt;/u&gt; to get it in your inbox.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;Steal this workflow&lt;/h2&gt;&lt;h4&gt;Get to inbox zero with Codex &lt;/h4&gt;&lt;p&gt;The perfect email workflow is the white whale that productivity people have chased for a decade, Dan included. His latest AI-native version puts the agent in the inbox and the human in a shared document, where every draft and decision stays visible. Here’s how he does it:&lt;/p&gt;&lt;p&gt;&lt;strong&gt;1. Write a one-page operating manual for your inbox. &lt;/strong&gt;The document, which Dan keeps in Proof, names his VIPs, describes what to auto-archive, summarize, or draft, and explains how to handle scheduling.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;2. Open your agent-native email tool in Codex.&lt;/strong&gt; In Codex’s browser pane, Dan loads Cora, which gives the agent two ways to act: &lt;u&gt;&lt;a href="https://cora.computer/api_tokens" rel="noopener noreferrer" target="_blank"&gt;command line instructions&lt;/a&gt;&lt;/u&gt; to archive threads, and the ability to click through the inbox like a person.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;3. Work from a document instead of your email. &lt;/strong&gt;Dan has Codex create a separate Proof document for each inbox run. 
Codex sweeps the inbox, archives what the operating manual says to archive, and adds every draft or decision to the bottom of the document. Dan replies inline: “Spam,” “archive,” “reply just to Willie asking what he wants to do here,” “send the invite, draft a reply to Tony.” Codex picks up each instruction, drafts the reply in Cora while Dan moves on to the next message, and waits for approval before sending.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Try it this week: &lt;/strong&gt;Write a one-page “how to do my email” document with your own VIPs, auto-archive rules, scheduling preferences, and reply style. Then open Codex, load your email client in its browser pane, and paste in your instruction document and this prompt:&lt;/p&gt;&lt;blockquote&gt;“Sweep my inbox using this operating manual. Put every draft and decision in this doc and wait for me before sending anything.”&lt;/blockquote&gt;&lt;div class="quill-block-image" id="quill-block-image-1777995093809-rrp68bkmn" data-source="{&amp;quot;dom_id&amp;quot;:&amp;quot;quill-block-image-1777995093809-rrp68bkmn&amp;quot;,&amp;quot;link&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4154/optimized_944e070e-dc06-4ece-952c-80b8094bfc3c.png&amp;quot;,&amp;quot;image&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4154/optimized_944e070e-dc06-4ece-952c-80b8094bfc3c.png&amp;quot;,&amp;quot;caption&amp;quot;:&amp;quot;Dan’s email workflow as set up in Codex: chat on the left, web browser with Cora on the right. In this version, Dan has also vibe coded a one-page interface that plugs into Cora’s CLI. 
(Image courtesy of Dan Shipper.)&amp;quot;,&amp;quot;error&amp;quot;:null}"&gt;&lt;div&gt;&lt;a href="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4154/optimized_944e070e-dc06-4ece-952c-80b8094bfc3c.png" target="_blank" rel="noopener noreferrer"&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4154/optimized_944e070e-dc06-4ece-952c-80b8094bfc3c.png" alt="Dan’s email workflow as set up in Codex: chat on the left, web browser with Cora on the right. In this version, Dan has also vibe coded a one-page interface that plugs into Cora’s CLI. (Image courtesy of Dan Shipper.)"&gt;&lt;/a&gt;&lt;figcaption class="quill-image-caption"&gt;Dan’s email workflow as set up in Codex: chat on the left, web browser with Cora on the right. In this version, Dan has also vibe coded a one-page interface that plugs into Cora’s CLI. (Image courtesy of Dan Shipper.)&lt;/figcaption&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;New job alert&lt;/h2&gt;&lt;p&gt;If the new meta-skill is knowing when to delegate and when to stay close, here it is in job-description form: Airtable is hiring an &lt;u&gt;&lt;a href="https://job-boards.greenhouse.io/airtable/jobs/8409168002" rel="noopener noreferrer" target="_blank"&gt;AI Agent Architect, Customer Experience&lt;/a&gt;&lt;/u&gt;. &lt;/p&gt;&lt;p&gt;Support software used to route tickets and surface help center articles. Now it can read context, act across tools, and decide what to do. 
Which means someone has to &lt;u&gt;&lt;a href="https://every.to/thesis/how-to-design-for-human-agent-interaction" rel="noopener noreferrer" target="_blank"&gt;design the boundary&lt;/a&gt;&lt;/u&gt; around support agents—what knowledge they retrieve, which APIs they can use, when they can modify an account, how failures get measured, and where the agent hands the work back to a person.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;Tool for thought&lt;/h2&gt;&lt;h4&gt;Musk’s five rules of automation, except for agents&lt;/h4&gt;&lt;p&gt;In 2021, Elon Musk introduced his “algorithm,” a &lt;u&gt;&lt;a href="https://www.inc.com/jeff-haden/elon-musks-algorithm-a-5-step-process-to-dramatically-improve-nearly-everything-is-both-simple-brilliant.html" rel="noopener noreferrer" target="_blank"&gt;five-step rubric&lt;/a&gt;&lt;/u&gt; he uses at Tesla and SpaceX to figure out what a process needs before trying to make it faster or handing off any part of it to a machine. &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@williewilliams" rel="noopener noreferrer" target="_blank"&gt;Willie Williams&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;, Every’s head of platform, has been exploring how it might apply to agent workflows: &lt;/p&gt;&lt;ol&gt;&lt;li&gt;&lt;strong&gt;Question every requirement. &lt;/strong&gt;Every rule, checkpoint, and instruction in a workflow has to justify itself by naming the specific thing that goes wrong without it. If nobody can answer that, it shouldn’t be there.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Delete what you can. &lt;/strong&gt;Cut steps, approvals, reviews, and agents that don’t survive step one. If you’re not occasionally removing something you later need to restore, you haven’t cut enough.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Simplify and clarify. &lt;/strong&gt;Break the remaining work into smaller, clearer pieces. 
Each task should have a single owner, a defined output, and only the information and tools it actually needs.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Accelerate feedback loops. &lt;/strong&gt;Shorten the time between handing work to an agent and knowing whether it succeeded. Surface errors early, run independent tasks at the same time, and stop making the workflow wait on unneeded approvals.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Automate last. &lt;/strong&gt;Start with a checkpoint at every step. Only after a workflow is necessary, lean, and fast should you take the humans out of the loop.&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;Still, Musk’s algorithm was intended for factories building electric cars, rockets, and satellites—hardware. Its rules don’t translate directly to AI agents. “These rules &lt;em&gt;should&lt;/em&gt; apply to the world of software automation,” says Willie, “but we don’t actually have them yet. And we have to work on finding them.”&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;Model card&lt;/h2&gt;&lt;div class="quill-block-image" id="quill-block-image-1777997529980-ils81bwrg" data-source="{&amp;quot;dom_id&amp;quot;:&amp;quot;quill-block-image-1777997529980-ils81bwrg&amp;quot;,&amp;quot;link&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4154/optimized_31a2d516-7e27-4980-b1c6-ae90d8ca6ed2.png&amp;quot;,&amp;quot;image&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4154/optimized_31a2d516-7e27-4980-b1c6-ae90d8ca6ed2.png&amp;quot;,&amp;quot;caption&amp;quot;:&amp;quot;ChatGPT/Every illustration.&amp;quot;,&amp;quot;error&amp;quot;:null}"&gt;&lt;div&gt;&lt;a href="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4154/optimized_31a2d516-7e27-4980-b1c6-ae90d8ca6ed2.png" target="_blank" rel="noopener noreferrer"&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4154/optimized_31a2d516-7e27-4980-b1c6-ae90d8ca6ed2.png" alt="ChatGPT/Every 
illustration."&gt;&lt;/a&gt;&lt;figcaption class="quill-image-caption"&gt;ChatGPT/Every illustration.&lt;/figcaption&gt;&lt;/div&gt;&lt;/div&gt;&lt;h2&gt;&lt;hr class="quill-line"&gt;&lt;/h2&gt;&lt;h2&gt;Signal&lt;/h2&gt;&lt;h4&gt;The hard part isn’t the model&lt;/h4&gt;&lt;p&gt;The bifurcation Dan named in Monday’s standup—delegate to the agent, or sit beside it—is the same problem for which frontier labs are now selling enterprise solutions.&lt;/p&gt;&lt;p&gt;OpenAI made it explicit last month with its new &lt;u&gt;&lt;a href="https://openai.com/index/frontier-alliance-partners/" rel="noopener noreferrer" target="_blank"&gt;Frontier Alliance&lt;/a&gt;&lt;/u&gt; initiative pairing OpenAI engineers with large enterprises to deploy agents inside their workflows. “The limiting factor for seeing value from AI in enterprises isn’t model intelligence,” writes OpenAI. “It’s how agents are built and run in their organizations.”&lt;/p&gt;&lt;p&gt;Then this week, &lt;u&gt;&lt;a href="https://www.anthropic.com/news/enterprise-ai-services-company" rel="noopener noreferrer" target="_blank"&gt;Anthropic announced a parallel move&lt;/a&gt;&lt;/u&gt;—a new services firm with Blackstone, private equity firm Hellman &amp;amp; Friedman, and Goldman Sachs to help companies “design, build, and maintain” Claude deployments.&lt;/p&gt;&lt;p&gt;Both labs are saying the quiet part out loud: The hard part of deploying and working with agents is everything around the models themselves—the context, permissions, handoffs, evaluations, and human relationships that decide whether a model should run ahead or sit beside you. Dan’s inbox workflow and Airtable’s support-agent job are microcosms of the same problem, now landing on the enterprise balance sheet. 
(Every’s &lt;u&gt;&lt;a href="https://every.to/consulting" rel="noopener noreferrer" target="_blank"&gt;consulting practice&lt;/a&gt;&lt;/u&gt; also helps companies implement AI workflows and products.)&lt;/p&gt;&lt;h5&gt;&lt;strong&gt;What to do this week: &lt;/strong&gt;&lt;/h5&gt;&lt;ul&gt;&lt;li&gt;&lt;strong&gt;Write down how you want the work done before you prompt.&lt;/strong&gt; What OpenAI and Anthropic are charging Fortune 500s millions for is the document Dan wrote himself in an afternoon: who counts as a VIP, what to auto-archive, when to escalate. Start there.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Split your tasks into “hand off” versus “stay close.”&lt;/strong&gt; Bug triage can run on its own. Important email drafts need you in the loop. Sort before you delegate.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Keep the agent’s actions visible.&lt;/strong&gt; Drafts in a shared document, tracked changes, an action log—whatever the form, you need a record. If you can’t audit the agent’s work and revert it if needed, you aren’t the one driving.&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;em&gt;&lt;a href="https://every.to/@katie.parrott12" rel="noopener noreferrer" target="_blank"&gt;Katie Parrott&lt;/a&gt;&lt;/em&gt;&lt;/strong&gt; &lt;em&gt;is a staff writer at Every. 
You can read more of her work in&lt;/em&gt; &lt;em&gt;&lt;a href="https://katieparrott.substack.com/" rel="noopener noreferrer" target="_blank"&gt;her newsletter&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;To read more essays like this, subscribe to &lt;u&gt;&lt;a href="https://every.to/subscribe" rel="noopener noreferrer" target="_blank"&gt;Every&lt;/a&gt;&lt;/u&gt;, and follow us on X at &lt;u&gt;&lt;a href="http://twitter.com/every" rel="noopener noreferrer" target="_blank"&gt;@every&lt;/a&gt;&lt;/u&gt; and on &lt;u&gt;&lt;a href="https://www.linkedin.com/company/everyinc/" rel="noopener noreferrer" target="_blank"&gt;LinkedIn&lt;/a&gt;&lt;/u&gt;.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;We &lt;u&gt;&lt;a href="https://every.to/studio" rel="noopener noreferrer" target="_blank"&gt;build AI tools&lt;/a&gt;&lt;/u&gt; for readers like you. Write brilliantly with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://writewithspiral.com/" rel="noopener noreferrer" target="_blank"&gt;Spiral&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. Organize files automatically with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://makeitsparkle.co/?utm_source=everyfooter" rel="noopener noreferrer" target="_blank"&gt;Sparkle&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. Deliver yourself from email with &lt;u&gt;&lt;a href="https://cora.computer" rel="noopener noreferrer" target="_blank"&gt;Cora&lt;/a&gt;&lt;/u&gt;. Dictate effortlessly with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://monologue.to" rel="noopener noreferrer" target="_blank"&gt;Monologue&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. 
Work on documents with AI agents using &lt;u&gt;&lt;a href="https://www.proofeditor.ai/?source=post_button" rel="noopener noreferrer" target="_blank"&gt;Proof&lt;/a&gt;&lt;/u&gt;.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;For sponsorship opportunities, reach out to sponsorships@every.to.&lt;/em&gt;&lt;/p&gt;&lt;div class="quill-button" data-source="{&amp;quot;id&amp;quot;:&amp;quot;quill-button-1769187301610&amp;quot;,&amp;quot;url&amp;quot;:&amp;quot;https://every.to/subscribe?source=post_button&amp;quot;,&amp;quot;text&amp;quot;:&amp;quot;Subscribe&amp;quot;}" id="quill-button-1769187301610"&gt;&lt;a href="https://every.to/subscribe?source=post_button"&gt;Subscribe&lt;/a&gt;&lt;/div&gt;</description>
      <author>Katie Parrott / Context Window</author>
      <pubDate>2026-05-05 07:00:00 -0400</pubDate>
      <guid>https://every.to/context-window/the-dawn-of-codex-native-apps</guid>
      <link>https://every.to/context-window/the-dawn-of-codex-native-apps</link>
    </item>
    <item>
      <title>I Let ChatGPT Manage My Workweek</title>
      <description>&lt;table&gt;&lt;tr&gt;&lt;td&gt;&lt;img alt="Working Overtime" src="https://d24ovhgu8s7341.cloudfront.net/uploads/publication/logo/100/small_Screenshot_2024-11-22_at_9.33.36_AM.png" /&gt;&lt;/td&gt;&lt;td&gt;&lt;/td&gt;&lt;td&gt;&lt;table&gt;&lt;tr&gt;&lt;td&gt;by &lt;a href="https://every.to/@katie.parrott12" itemprop="name"&gt;Katie Parrott&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;in &lt;a href="https://every.to/working-overtime"&gt;Working Overtime&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;figure&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/post/cover/4153/full_page_cover_b8aacc95f337281e-AI_Project_Manager.png"&gt;&lt;figcaption&gt;Midjourney/Every illustration.&lt;/figcaption&gt;&lt;/figure&gt;&lt;p&gt;&lt;em&gt;Was this newsletter forwarded to you? &lt;u&gt;&lt;a href="https://every.to/account" rel="noopener noreferrer" target="_blank"&gt;Sign up&lt;/a&gt;&lt;/u&gt; to get it in your inbox.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;I sat down to write my second-quarter goals at 4:30 p.m. on a Tuesday in early April. It was the day after I was supposed to turn them in when I decided to be an adult and survey the damage from the first quarter. And I do mean damage. I’d written only half of the columns I’d committed to. Another project I had promised hadn’t even gotten off the ground. &lt;/p&gt;&lt;p&gt;I could give the usual excuses—the quarter was busy, the project hit walls outside my control—but the real culprit was obvious: I may be a great writer, but I am garbage at project management.&lt;/p&gt;&lt;p&gt;For 15 years, I handled this weakness by tiptoeing around it. I didn’t take on managerial roles that would have required more organizational skills. I didn’t take on so much freelance work that I couldn’t keep the deadlines in my head. I passed on ambitious projects—too many moving parts. 
&lt;/p&gt;&lt;p&gt;This duct-taped approach worked until I decided to join Every full-time in April. If I were going to take on more responsibility as a full member of the team, I needed to get serious about project management. Which, in 2026, meant I needed to bring in AI.  &lt;/p&gt;&lt;p&gt;So I built myself a project manager: a ChatGPT agent that holds my OKRs—&lt;u&gt;&lt;a href="https://every.to/source-code/how-we-run-a-25-person-company-on-four-ai-agents" rel="noopener noreferrer" target="_blank"&gt;objectives and key results&lt;/a&gt;&lt;/u&gt;, the goals that define a successful quarter—watches my calendar, reads my Notion to-do list, and helps me decide what to do next. Otherwise, I’d spend my day opening Slack, refreshing X, panicking lightly, repeat.&lt;/p&gt;&lt;div class="quill-block-image" id="quill-block-image-1777904333883-03a1qi128" data-source="{&amp;quot;dom_id&amp;quot;:&amp;quot;quill-block-image-1777904333883-03a1qi128&amp;quot;,&amp;quot;link&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4153/optimized_67648ca9-87d7-4496-92f7-450891620373.png&amp;quot;,&amp;quot;image&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4153/optimized_67648ca9-87d7-4496-92f7-450891620373.png&amp;quot;,&amp;quot;caption&amp;quot;:&amp;quot;My ChatGPT project management agent helpfully points me toward where to put my focus for a day. (All images courtesy of Katie Parrott.)&amp;quot;,&amp;quot;error&amp;quot;:null}"&gt;&lt;div&gt;&lt;a href="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4153/optimized_67648ca9-87d7-4496-92f7-450891620373.png" target="_blank" rel="noopener noreferrer"&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4153/optimized_67648ca9-87d7-4496-92f7-450891620373.png" alt="My ChatGPT project management agent helpfully points me toward where to put my focus for a day. 
(All images courtesy of Katie Parrott.)"&gt;&lt;/a&gt;&lt;figcaption class="quill-image-caption"&gt;My ChatGPT project management agent helpfully points me toward where to put my focus for a day. (All images courtesy of Katie Parrott.)&lt;/figcaption&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;&lt;br&gt;&lt;/p&gt;&lt;p&gt;Most AI-at-work advice starts with the part of your job you’re already good at: Write faster, code faster, analyze faster, ship more. I’m interested in the other side of the equation: using AI to support the part of work that makes it hard to believe you’re &lt;u&gt;&lt;a href="https://every.to/working-overtime/i-asked-claude-the-question-i-could-never-ask-my-boss" rel="noopener noreferrer" target="_blank"&gt;good at your job&lt;/a&gt;&lt;/u&gt;.&lt;/p&gt;&lt;p&gt;I’ve set up project management with both my &lt;u&gt;&lt;a href="https://every.to/plus-one" rel="noopener noreferrer" target="_blank"&gt;Plus One agent&lt;/a&gt;&lt;/u&gt;, Margot, and as a &lt;u&gt;&lt;a href="https://openai.com/index/introducing-workspace-agents-in-chatgpt/" rel="noopener noreferrer" target="_blank"&gt;ChatGPT agent&lt;/a&gt;&lt;/u&gt;. I’m featuring the ChatGPT agent here, but you can create your own project manager with any system that gives you a combination of memory, context, and intelligence—more on that below.&lt;/p&gt;&lt;h2&gt;Why AI can babysit my to-do list now &lt;/h2&gt;&lt;p&gt;I’d tried using ChatGPT as a project manager before, during a freelance month last year when I’d overbooked myself and had deadlines staring me down like unread letters from the IRS. I would open a new chat and type some version of: “I have this deadline, this deadline, and this deadline; this meeting, this meeting, and this meeting. What should I do?”&lt;/p&gt;&lt;p&gt;For one-off triage, it worked well enough. The problem was the context that it had about me—or didn’t. 
Every time I came back, I had to explain everything again: the clients, the deadlines, the pieces in flight, the meetings, the priorities, the fact that one project was more important than another for reasons that were obvious to me and invisible to the chat window.&lt;/p&gt;&lt;div class="quill-block-image" id="quill-block-image-1777904333886-d0ferulcv" data-source="{&amp;quot;dom_id&amp;quot;:&amp;quot;quill-block-image-1777904333886-d0ferulcv&amp;quot;,&amp;quot;link&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4153/optimized_510b7535-368a-4892-b700-fa23df36fdb4.png&amp;quot;,&amp;quot;image&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4153/optimized_510b7535-368a-4892-b700-fa23df36fdb4.png&amp;quot;,&amp;quot;caption&amp;quot;:&amp;quot;A glimpse of my ChatGPT project management system, manually informing the AI of my deadlines day by day.&amp;quot;,&amp;quot;error&amp;quot;:null}"&gt;&lt;div&gt;&lt;a href="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4153/optimized_510b7535-368a-4892-b700-fa23df36fdb4.png" target="_blank" rel="noopener noreferrer"&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4153/optimized_510b7535-368a-4892-b700-fa23df36fdb4.png" alt="A glimpse of my ChatGPT project management system, manually informing the AI of my deadlines day by day."&gt;&lt;/a&gt;&lt;figcaption class="quill-image-caption"&gt;A glimpse of my ChatGPT project management system, manually informing the AI of my deadlines day by day.&lt;/figcaption&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;&lt;br&gt;&lt;/p&gt;&lt;p&gt;Then, over the past six months, several things converged to make more comprehensive project management using ChatGPT possible. 
&lt;/p&gt;&lt;p&gt;First, &lt;u&gt;&lt;a href="https://every.to/also-true-for-humans/why-i-turned-off-chatgpt-s-memory" rel="noopener noreferrer" target="_blank"&gt;memory&lt;/a&gt;&lt;/u&gt; improved enough that the system could carry context and apply it across conversations. Next came advanced tool use, which enabled AI to navigate and use browsers and other tools. Integrations meant that ChatGPT could finally &lt;em&gt;do&lt;/em&gt; things like open my Notion, check my calendar, and read my Slack. Finally, products like OpenClaw and Every’s &lt;u&gt;&lt;a href="https://every.to/plus-one" rel="noopener noreferrer" target="_blank"&gt;Plus One&lt;/a&gt;&lt;/u&gt; wrapped all this firepower in a package that even I, a technical neophyte, can work with. &lt;/p&gt;&lt;p&gt;If you tried to do something with AI a year ago—like manage a marketing workflow or run an analysis of financial results—and it didn’t take, try again. Chances are that the model and the product around it have shifted in ways that move the finish line in your favor. It was time for me to take another swing at AI-native project management. &lt;/p&gt;&lt;h2&gt;What I built: A project management agent&lt;/h2&gt;&lt;p&gt;Saying “I built an agent” makes the whole thing sound more sophisticated than it is. The truth is that AI did most of the work—I just put the right information in places AI could see it, connected the tools and software where my work happens, and described the job I wanted done.&lt;/p&gt;&lt;h4&gt;Context to shape the agent’s memory&lt;/h4&gt;&lt;p&gt;With context, the agent can turn a vague goal into Thursday’s first task. Without it, it’s just a Magic 8 Ball for to-do lists.&lt;/p&gt;&lt;p&gt;So, as I was going through the setup for my agent (which you can do directly through the chat interface), I made sure to provide plenty of documentation for the agent-builder to build on top of. 
Most importantly, I gave it a link to a &lt;strong&gt;&lt;u&gt;&lt;a href="https://proofeditor.ai/" rel="noopener noreferrer" target="_blank"&gt;Proof&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; document with my OKRs, four objectives, a dozen-ish key results, and a rough sense of a stack-ranking of projects. Then I asked it to do the first piece of project management I am worst at: I asked it to turn “a successful quarter” into concrete phases, milestones, deadlines, and tasks.&lt;/p&gt;&lt;div class="quill-block-image" id="quill-block-image-1777904333887-5ph4lges7" data-source="{&amp;quot;dom_id&amp;quot;:&amp;quot;quill-block-image-1777904333887-5ph4lges7&amp;quot;,&amp;quot;link&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4153/optimized_580fbc95-f395-4317-a42a-340986821c1b.png&amp;quot;,&amp;quot;image&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4153/optimized_580fbc95-f395-4317-a42a-340986821c1b.png&amp;quot;,&amp;quot;caption&amp;quot;:&amp;quot;The agent broke my OKRs down into a week-by-week action plan, then converted that into tasks for my Notion to-do list.&amp;quot;,&amp;quot;error&amp;quot;:null}"&gt;&lt;div&gt;&lt;a href="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4153/optimized_580fbc95-f395-4317-a42a-340986821c1b.png" target="_blank" rel="noopener noreferrer"&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4153/optimized_580fbc95-f395-4317-a42a-340986821c1b.png" alt="The agent broke my OKRs down into a week-by-week action plan, then converted that into tasks for my Notion to-do list."&gt;&lt;/a&gt;&lt;figcaption class="quill-image-caption"&gt;The agent broke my OKRs down into a week-by-week action plan, then converted that into tasks for my Notion to-do list.&lt;/figcaption&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;&lt;br&gt;&lt;/p&gt;&lt;p&gt;“Stand up a reliable &lt;u&gt;&lt;a href="https://every.to/vibe-check" rel="noopener noreferrer" 
target="_blank"&gt;Vibe Check&lt;/a&gt;&lt;/u&gt; pipeline” is a concrete goal, but not something you can do on a Thursday afternoon. The agent broke it into smaller pieces: Audit the existing process, draft a brief outlining suggested changes, solicit feedback, and implement the changes. &lt;/p&gt;&lt;p&gt;The first useful thing the agent gave me was a draft to respond to. Some of the tasks were so abstract I couldn’t tell where to start, and others were so chunky they were really projects in disguise. So I went back and forth with the agent to set a few parameters—mostly telling it, “This is too confusing for me to act on”—and it split, renamed, and rewrote the items until the plan had been divided into projects and tasks that were doable. &lt;/p&gt;&lt;p&gt;Then the tasks went into Notion, where they became a board with deadlines, statuses, and linked OKRs. &lt;/p&gt;&lt;h4&gt;Integrations give the AI places to act&lt;/h4&gt;&lt;p&gt;The next step was adding integrations so that the agent could track my work across tools.&lt;/p&gt;&lt;p&gt;ChatGPT agents make this almost embarrassingly easy now. In a few clicks, I connected the agent to the places where my work already lives: Notion, Slack, Google Drive, and Calendar. 
&lt;/p&gt;&lt;div class="quill-block-image" id="quill-block-image-1777904333889-b1rrj60bg" data-source="{&amp;quot;dom_id&amp;quot;:&amp;quot;quill-block-image-1777904333889-b1rrj60bg&amp;quot;,&amp;quot;link&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4153/optimized_b8f55262-f257-4fe1-957a-4cdfacf2071b.png&amp;quot;,&amp;quot;image&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4153/optimized_b8f55262-f257-4fe1-957a-4cdfacf2071b.png&amp;quot;,&amp;quot;caption&amp;quot;:&amp;quot;The dashboard for my project manager agent, complete with integrated apps, context files, and memory.&amp;quot;,&amp;quot;error&amp;quot;:null}"&gt;&lt;div&gt;&lt;a href="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4153/optimized_b8f55262-f257-4fe1-957a-4cdfacf2071b.png" target="_blank" rel="noopener noreferrer"&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4153/optimized_b8f55262-f257-4fe1-957a-4cdfacf2071b.png" alt="The dashboard for my project manager agent, complete with integrated apps, context files, and memory."&gt;&lt;/a&gt;&lt;figcaption class="quill-image-caption"&gt;The dashboard for my project manager agent, complete with integrated apps, context files, and memory.&lt;/figcaption&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;&lt;br&gt;&lt;/p&gt;&lt;p&gt;This is the part that would not have worked a year ago. Back then, ChatGPT only knew what I remembered to paste into the chat box—it couldn’t take action on my behalf. Now the agent can read the systems I already use. It can see on my calendar that Thursday morning is open, that a discussion on a Slack thread created a new task for me to do, that an article draft exists somewhere in Drive, and that a project belongs to an OKR and isn’t just a guilty little cloud floating around on Notion.&lt;/p&gt;&lt;h4&gt;Instructions tell the agent what to do&lt;/h4&gt;&lt;p&gt;Context tells the agent what matters. 
Integrations tell it where to look. Instructions tell it what to do. I had to write fewer of them than I expected.&lt;/p&gt;&lt;p&gt;I opened the &lt;u&gt;&lt;a href="https://openai.com/index/introducing-workspace-agents-in-chatgpt/" rel="noopener noreferrer" target="_blank"&gt;ChatGPT agent&lt;/a&gt;&lt;/u&gt; builder, which you can find in the left-hand sidebar of the ChatGPT web app. Then I explained, in plain English, what I wanted: a project-management agent that would help me organize each week and keep my quarterly objectives on track. The builder turned that into a fuller brief with its role, workflows, and instructions on how to deliver responses, where to store information for future reference, and what NOT to do (for example, invent a status or deadline).&lt;/p&gt;&lt;div class="quill-block-image" id="quill-block-image-1777904333889-kty2j159z" data-source="{&amp;quot;dom_id&amp;quot;:&amp;quot;quill-block-image-1777904333889-kty2j159z&amp;quot;,&amp;quot;link&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4153/optimized_9f0a7c6a-5e14-4a39-b0a3-d64d33ef4690.png&amp;quot;,&amp;quot;image&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4153/optimized_9f0a7c6a-5e14-4a39-b0a3-d64d33ef4690.png&amp;quot;,&amp;quot;caption&amp;quot;:&amp;quot;The beginning of the instructions that power my project management agent.&amp;quot;,&amp;quot;error&amp;quot;:null}"&gt;&lt;div&gt;&lt;a href="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4153/optimized_9f0a7c6a-5e14-4a39-b0a3-d64d33ef4690.png" target="_blank" rel="noopener noreferrer"&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4153/optimized_9f0a7c6a-5e14-4a39-b0a3-d64d33ef4690.png" alt="The beginning of the instructions that power my project management agent."&gt;&lt;/a&gt;&lt;figcaption class="quill-image-caption"&gt;The beginning of the instructions that power my project management 
agent.&lt;/figcaption&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;&lt;br&gt;&lt;/p&gt;&lt;p&gt;Ultimately, the instructions I care about boil down to this: Help me organize the week, keep the quarterly objectives on track, and do the useful work first instead of requiring so much input from me that I might as well have gone in and looked at all the inputs myself.&lt;/p&gt;&lt;h2&gt;I can’t automate the ‘me’ of it all&lt;/h2&gt;&lt;p&gt;I may be offloading a type of work that I hate and am bad at, but I’m also learning new skills—or relearning them for the agentic era. Mostly, these lessons emerge through failure.&lt;/p&gt;&lt;p&gt;Oftentimes, the failure is one of communication. It took time to get in the habit of keeping my agent up-to-date on the details it &lt;em&gt;can’t&lt;/em&gt; see. An article would be published, and I’d forget to tell the agent or move the card in Notion that corresponded to it. Deadlines moved while Notion stayed stuck on the old date, and the agent became about as useful as my dog when I tell her to go get a toy from upstairs.&lt;/p&gt;&lt;div class="quill-block-image" id="quill-block-image-1777904333889-aqdcrpx00" data-source="{&amp;quot;dom_id&amp;quot;:&amp;quot;quill-block-image-1777904333889-aqdcrpx00&amp;quot;,&amp;quot;link&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4153/optimized_d4b0cddb-3be3-4442-a1e1-7925cb3769d0.png&amp;quot;,&amp;quot;image&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4153/optimized_d4b0cddb-3be3-4442-a1e1-7925cb3769d0.png&amp;quot;,&amp;quot;caption&amp;quot;:&amp;quot;My Notion to-do list functions as the source of truth for me and the agent about the status of projects. 
If it’s not up-to-date, the whole system falls apart.&amp;quot;,&amp;quot;error&amp;quot;:null}"&gt;&lt;div&gt;&lt;a href="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4153/optimized_d4b0cddb-3be3-4442-a1e1-7925cb3769d0.png" target="_blank" rel="noopener noreferrer"&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4153/optimized_d4b0cddb-3be3-4442-a1e1-7925cb3769d0.png" alt="My Notion to-do list functions as the source of truth for me and the agent about the status of projects. If it’s not up-to-date, the whole system falls apart."&gt;&lt;/a&gt;&lt;figcaption class="quill-image-caption"&gt;My Notion to-do list functions as the source of truth for me and the agent about the status of projects. If it’s not up-to-date, the whole system falls apart.&lt;/figcaption&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;&lt;br&gt;&lt;/p&gt;&lt;p&gt;I have to tell the agent when a draft is in review or is published, a deadline changes, or a new task appears in a meeting. Updating a Notion page is annoying. But annoying is better than carrying the whole quarter in my head. &lt;/p&gt;&lt;p&gt;Another wrinkle is the “me” problem. The agent can’t change my personality. It can’t make me less anxious or more confident in my ideas. So, for example, I’ve been sitting on a proposal for my biggest Q2 project for a week because I can’t convince myself it’s good enough to send. The agent knows this. It reminds me that it’s overdue every day. And I keep avoiding it. The agent can draft the email and flag the delay, but it can’t tell me if the idea is good. That part—deciding to believe in the thing you made—is still mine. AI, it turns out, is no match for my neuroticism.&lt;/p&gt;&lt;h2&gt;Knowing while there’s still time&lt;/h2&gt;&lt;p&gt;Near the end of every week, I ask the agent for the thing I used to dread the most: a status report. 
It reviews the work that was supposed to get done, what moved, what slipped, and which goals are starting to look further from reach. Sometimes the answer is satisfying. Sometimes it is rude in the way accurate things are rude.&lt;/p&gt;&lt;p&gt;One day recently, I asked it for a report on my OKR progress: One project had momentum but needed a cleaner path to delivery; another looked healthy, but only if I had artifacts to show for it that the agent couldn’t see; my publishing cadence was fine, but would be better if I set up the idea backlog the agent and I had talked about. &lt;/p&gt;&lt;div class="quill-block-image" id="quill-block-image-1777904333889-lzsgre33g" data-source="{&amp;quot;dom_id&amp;quot;:&amp;quot;quill-block-image-1777904333889-lzsgre33g&amp;quot;,&amp;quot;link&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4153/optimized_fa284e4b-af80-45d2-9e95-67820cfd442b.png&amp;quot;,&amp;quot;image&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4153/optimized_fa284e4b-af80-45d2-9e95-67820cfd442b.png&amp;quot;,&amp;quot;caption&amp;quot;:&amp;quot;The agent’s take on the status of my three active OKRs. There’s nothing on fire, but it gives me a sense of where to put my focus in the next few weeks.&amp;quot;,&amp;quot;error&amp;quot;:null}"&gt;&lt;div&gt;&lt;a href="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4153/optimized_fa284e4b-af80-45d2-9e95-67820cfd442b.png" target="_blank" rel="noopener noreferrer"&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4153/optimized_fa284e4b-af80-45d2-9e95-67820cfd442b.png" alt="The agent’s take on the status of my three active OKRs. There’s nothing on fire, but it gives me a sense of where to put my focus in the next few weeks."&gt;&lt;/a&gt;&lt;figcaption class="quill-image-caption"&gt;The agent’s take on the status of my three active OKRs. 
There’s nothing on fire, but it gives me a sense of where to put my focus in the next few weeks.&lt;/figcaption&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;&lt;br&gt;&lt;/p&gt;&lt;p&gt;This is the kind of thing a competent project manager would probably notice in a 20-minute check-in. Which is exactly what I want from the agent: making the obvious visible before it becomes a delay that turns into a problem that snowballs into a failed objective or, worse, a disappointed teammate. &lt;/p&gt;&lt;p&gt;For most of my career, deadlines and prioritization felt like weather systems: suddenly overhead, occasionally catastrophic, mostly outside my control. Now I can see the front forming in time to take action. &lt;/p&gt;&lt;p&gt;If AI has only been helping you with the part of work you already do well, try pointing it at the part you have been avoiding. If the promise of AI is that it frees up humans to do what only humans can do, that should include freeing us &lt;em&gt;from&lt;/em&gt; things we hate to do. Otherwise, what’s the point?  &lt;/p&gt;&lt;p&gt;I am still bad at project management. The part of work that makes me feel like I am faking adulthood still exists. But I have support for that now, so the writing gets the hours it deserves. &lt;/p&gt;&lt;h2&gt;&lt;strong&gt;Build your own project manager &lt;/strong&gt;&lt;/h2&gt;&lt;p&gt;If you want to set up your own project-management agent, here’s what I’d gather before you open the agent builder.&lt;/p&gt;&lt;h4&gt;&lt;strong&gt;1. Context: The documents to feed it&lt;/strong&gt;&lt;/h4&gt;&lt;p&gt;Think of this as the agent’s onboarding material. The more it can read about your priorities, the less you’ll have to repeat in chat.&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;strong&gt;OKRs or quarterly goals.&lt;/strong&gt; The single most important file. 
If you don’t have written OKRs, write a one-page version of what a successful quarter looks like—your objectives, the rough metrics that prove them, and any projects you’ve already committed to.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Strategy or planning docs.&lt;/strong&gt; Anything that explains the &lt;em&gt;why&lt;/em&gt; behind the work: team strategy memos, annual plans, project briefs, and kickoff documents.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Workstream documentation.&lt;/strong&gt; Standing responsibilities you want the agent to know about, such as your editorial calendar, cadence of the content you publish, and recurring meetings.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;A stack-rank of your goals.&lt;/strong&gt; Which OKR matters most? Which project is the one you’d protect if everything else slipped? Write this down. &lt;/li&gt;&lt;/ul&gt;&lt;h4&gt;&lt;strong&gt;2. Integrations: Connect the tools where you work&lt;/strong&gt;&lt;/h4&gt;&lt;p&gt;Connect the systems where the work actually lives. &lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;strong&gt;A task manager.&lt;/strong&gt; Notion, Todoist, Asana, Linear, or whatever you already use. This becomes the source of truth for the status of your work. If you don’t have one, set one up before you build the agent.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Your calendar.&lt;/strong&gt; Google or Outlook. The agent needs to see where your time is spent versus where you said it would be spent.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Slack or your team chat.&lt;/strong&gt; This allows the agent to pick up tasks that get assigned in conversation and never make it into your task manager.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Cloud drive.&lt;/strong&gt; Google Drive, Dropbox, OneDrive, or wherever your drafts and working documents live.&lt;/li&gt;&lt;/ul&gt;&lt;h4&gt;&lt;strong&gt;3. The prompt&lt;/strong&gt;&lt;/h4&gt;&lt;p&gt;Here’s the brief I gave my agent builder. 
Keep the structure and adapt the specifics to your work.&lt;/p&gt;&lt;div class="quill-code-snippet code-snippet" id="quill-code-snippet-1777904346525" data-code-snippet="" data-source="{&amp;quot;id&amp;quot;:&amp;quot;quill-code-snippet-1777904346525&amp;quot;,&amp;quot;title&amp;quot;:&amp;quot;Project manager agent prompt&amp;quot;,&amp;quot;language&amp;quot;:&amp;quot;other&amp;quot;,&amp;quot;code&amp;quot;:&amp;quot;You are my project manager. Your job is to help me organize each week and keep my quarterly objectives on track.\nYou have access to my OKRs, my Notion to-do list, my calendar, my Slack, and my Drive. Treat my OKR document as the source of truth for what matters this quarter, and treat Notion as the source of truth for project status.\nEach Monday, give me a one-page plan for the week: what's due, what's at risk, and what I should focus on first, based on which OKR each task ladders up to. Each Friday, give me a status report: what got done, what slipped, and which goals are starting to look further from reach.\nWhen I ask, \&amp;quot;What should I work on now?\&amp;quot;, check my calendar for available time and my Notion board for open tasks, then recommend one thing—not five.\nDon't invent statuses, deadlines, or tasks. If a date isn't in Notion, say so. If a task is ambiguous, ask me one clarifying question rather than guessing.\nProtect my stated priorities from my daily impulses. If I ask for help with something that isn't on the OKR list, flag it before you help.\n&amp;quot;,&amp;quot;show_claude&amp;quot;:false,&amp;quot;show_chatgpt&amp;quot;:false,&amp;quot;show_gemini&amp;quot;:false,&amp;quot;show_copy&amp;quot;:true}"&gt;
      &lt;div class="code-snippet-header"&gt;
        &lt;div class="code-snippet-header-left"&gt;
          &lt;span class="code-snippet-title"&gt;Project manager agent prompt&lt;/span&gt;
          &lt;span class="code-snippet-lang-badge"&gt;Other&lt;/span&gt;
        &lt;/div&gt;
        &lt;div class="code-snippet-actions"&gt;&lt;button class="code-snippet-btn" aria-label="Copy code" data-tip="Copy code" data-copy-code=""&gt;&lt;svg width="16" height="16" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"&gt;&lt;rect x="9" y="9" width="13" height="13" rx="2" ry="2"&gt;&lt;/rect&gt;&lt;path d="M5 15H4a2 2 0 0 1-2-2V4a2 2 0 0 1 2-2h9a2 2 0 0 1 2 2v1"&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/button&gt;&lt;/div&gt;
      &lt;/div&gt;
      &lt;div class="code-snippet-body"&gt;
        &lt;div class="code-snippet-gutter" aria-hidden="true"&gt;&lt;span class="code-snippet-line-num"&gt;1&lt;/span&gt;&lt;span class="code-snippet-line-num"&gt;2&lt;/span&gt;&lt;span class="code-snippet-line-num"&gt;3&lt;/span&gt;&lt;span class="code-snippet-line-num"&gt;4&lt;/span&gt;&lt;span class="code-snippet-line-num"&gt;5&lt;/span&gt;&lt;span class="code-snippet-line-num"&gt;6&lt;/span&gt;&lt;/div&gt;
        &lt;pre class="code-snippet-code" data-code-text=""&gt;You are my project manager. Your job is to help me organize each week and keep my quarterly objectives on track.
You have access to my OKRs, my Notion to-do list, my calendar, my Slack, and my Drive. Treat my OKR document as the source of truth for what matters this quarter, and treat Notion as the source of truth for project status.
Each Monday, give me a one-page plan for the week: what's due, what's at risk, and what I should focus on first, based on which OKR each task ladders up to. Each Friday, give me a status report: what got done, what slipped, and which goals are starting to look further from reach.
When I ask, &lt;span class="cs-string"&gt;"What should I work on now?"&lt;/span&gt;, check my calendar for available time and my Notion board for open tasks, then recommend one thing—not five.
Don't invent statuses, deadlines, or tasks. If a date isn't in Notion, say so. If a task is ambiguous, ask me one clarifying question rather than guessing.
Protect my stated priorities from my daily impulses. If I ask for help with something that isn't on the OKR list, flag it before you help.&lt;/pre&gt;
      &lt;/div&gt;
    &lt;/div&gt;&lt;p&gt;&lt;br&gt;&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;strong&gt;&lt;em&gt;&lt;a href="https://every.to/@katie.parrott12" rel="noopener noreferrer" target="_blank"&gt;Katie Parrott&lt;/a&gt;&lt;/em&gt;&lt;/strong&gt; &lt;em&gt;is a staff writer at Every. You can read more of her work in&lt;/em&gt; &lt;em&gt;&lt;a href="https://katieparrott.substack.com/" rel="noopener noreferrer" target="_blank"&gt;her newsletter&lt;/a&gt;.&lt;/em&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;To read more essays like this, subscribe to &lt;u&gt;&lt;a href="https://every.to/subscribe" rel="noopener noreferrer" target="_blank"&gt;Every&lt;/a&gt;&lt;/u&gt;, and follow us on X at &lt;u&gt;&lt;a href="http://twitter.com/every" rel="noopener noreferrer" target="_blank"&gt;@every&lt;/a&gt;&lt;/u&gt; and on &lt;u&gt;&lt;a href="https://www.linkedin.com/company/everyinc/" rel="noopener noreferrer" target="_blank"&gt;LinkedIn&lt;/a&gt;&lt;/u&gt;.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;For sponsorship opportunities, reach out to sponsorships@every.to.&lt;/em&gt;&lt;/p&gt;&lt;div class="quill-button" data-source="{&amp;quot;id&amp;quot;:&amp;quot;quill-button-1769187301610&amp;quot;,&amp;quot;url&amp;quot;:&amp;quot;https://every.to/subscribe?source=post_button&amp;quot;,&amp;quot;text&amp;quot;:&amp;quot;Subscribe&amp;quot;}" id="quill-button-1769187301610"&gt;&lt;a href="https://every.to/subscribe?source=post_button"&gt;Subscribe&lt;/a&gt;&lt;/div&gt;</description>
      <author>Katie Parrott / Working Overtime</author>
      <pubDate>2026-05-04 11:00:00 -0400</pubDate>
      <guid>https://every.to/working-overtime/i-let-chatgpt-manage-my-workweek</guid>
      <link>https://every.to/working-overtime/i-let-chatgpt-manage-my-workweek</link>
    </item>
    <item>
      <title>Codex Goes to Work</title>
      <description>&lt;table&gt;&lt;tr&gt;&lt;td&gt;&lt;img alt="Context Window" src="https://d24ovhgu8s7341.cloudfront.net/uploads/publication/logo/94/small_context_windown_1.png" /&gt;&lt;/td&gt;&lt;td&gt;&lt;/td&gt;&lt;td&gt;&lt;table&gt;&lt;tr&gt;&lt;td&gt;by &lt;a href="https://every.to/@Every%20Staff" itemprop="name"&gt;Every Staff&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;in &lt;a href="https://every.to/context-window"&gt;Context Window&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;figure&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/post/cover/4152/full_page_cover_f901785a9089fc9e-Codex_Goes_to_Work.png"&gt;&lt;figcaption&gt;Midjourney/Every illustration.&lt;/figcaption&gt;&lt;/figure&gt;&lt;p&gt;&lt;em&gt;Hello, and happy Sunday! Was this newsletter forwarded to you? &lt;u&gt;&lt;a href="https://every.to/account" rel="noopener noreferrer" target="_blank"&gt;Sign up&lt;/a&gt;&lt;/u&gt; to get it in your inbox.&lt;/em&gt;&lt;/p&gt;&lt;h2&gt;&lt;hr class="quill-line"&gt;&lt;/h2&gt;&lt;h2&gt;Knowledge base&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/guides/ai-product-management-guide" rel="noopener noreferrer" target="_blank"&gt;“A Guide to Agent-native Product Management”&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; &lt;em&gt;by &lt;a href="https://every.to/@marcus_fd8302_1" rel="noopener noreferrer" target="_blank"&gt;Marcus Moretti&lt;/a&gt;/Guides&lt;/em&gt;: &lt;strong&gt;&lt;u&gt;&lt;a href="mailto:marcus@every.to" rel="noopener noreferrer" target="_blank"&gt;Marcus Moretti&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; runs Spiral as a one-person team. This guide walks through the two new compound engineering skills that make it possible: /ce:strategy, which interviews you to produce a strategy document, and /ce:product-pulse, which replaces your analytics tools with a founder-style analyst briefing that saves to a folder as your product’s running memory. 
Read this to set up both commands for your own product and understand how they plug into the broader plan-ship-review loop. &lt;strong&gt;Plus:&lt;/strong&gt; The one thing Marcus still writes himself is the roadmap. Read the &lt;u&gt;&lt;a href="https://every.to/p/claude-code-for-product-managers" rel="noopener noreferrer" target="_blank"&gt;accompanying essay&lt;/a&gt;&lt;/u&gt; for his full workflow, plus his two-part test for which SaaS products will survive the agent era.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/also-true-for-humans/you-are-the-most-expensive-model" rel="noopener noreferrer" target="_blank"&gt;“You Are the Most Expensive Model”&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; &lt;em&gt;by&lt;/em&gt; &lt;em&gt;&lt;u&gt;&lt;a href="https://every.to/@mike_2114" rel="noopener noreferrer" target="_blank"&gt;Mike Taylor&lt;/a&gt;&lt;/u&gt;/Also True for Humans:&lt;/em&gt; Most teams are routing entire workflows through frontier models when cheaper, faster alternatives would do the job just as well. The real cost isn’t the tokens—it’s your attention. &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@mike_2114" rel="noopener noreferrer" target="_blank"&gt;Mike Taylor&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; introduces incremental determinism: a four-level framework for deciding which tasks deserve Opus and which can be handed to Haiku, a script, or no model at all. 
Read this to know exactly which lever to pull when your AI costs start to add up.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/context-window/one-app-to-rule-all-knowledge-work" rel="noopener noreferrer" target="_blank"&gt;“One App to Rule All Knowledge Work”&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; &lt;em&gt;by&lt;/em&gt; &lt;em&gt;&lt;u&gt;&lt;a href="https://every.to/@katie.parrott12" rel="noopener noreferrer" target="_blank"&gt;Katie Parrott&lt;/a&gt;&lt;/u&gt;/Context Window:&lt;/em&gt; &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@tedescau" rel="noopener noreferrer" target="_blank"&gt;Austin Tedesco&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; now runs 80 percent of his daily workflow through Codex, a tool he called “trash” for non-engineers just months ago. &lt;strong&gt;Plus: &lt;/strong&gt;why Austin reviews every agent output in its destination app, a prompt for letting agents design their own automations, and how to use Every’s compound knowledge plugin to catch confidently wrong data before a plan gets enacted.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/context-window/compute-is-the-new-cash" rel="noopener noreferrer" target="_blank"&gt;“Compute Is the New Cash”&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; &lt;em&gt;by &lt;a href="https://every.to/@laura_27bbaf_1" rel="noopener noreferrer" target="_blank"&gt;Laura Entis&lt;/a&gt;/Context Window:&lt;/em&gt; On &lt;em&gt;AI &amp;amp; I&lt;/em&gt;, &lt;strong&gt;Emily Glassberg Sands&lt;/strong&gt;, head of data and AI at Stripe, talks to &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@danshipper" rel="noopener noreferrer" target="_blank"&gt;Dan Shipper&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; about how agents are becoming economic participants—and why fraud is now a full-funnel problem, not just a checkout one. 
&lt;strong&gt;Plus:&lt;/strong&gt; GitHub and Anthropic are both moving to usage-based pricing as flat-rate subscriptions break down under agentic workloads; Dan and &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@kieran_1355" rel="noopener noreferrer" target="_blank"&gt;Kieran Klaassen&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; offer contrasting takes on whether you should talk to your agents or just let them work; and &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@naveen_6804" rel="noopener noreferrer" target="_blank"&gt;Naveen Naidu&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;’s three-step workflow for turning post-launch customer feedback into a product queue. 🎧 🖥 Listen on &lt;u&gt;&lt;a href="https://open.spotify.com/episode/1pR0DddFi6645oTlOX9uq9?si=5jU2B7j6RgOvLretK1fHjg" rel="noopener noreferrer" target="_blank"&gt;Spotify&lt;/a&gt;&lt;/u&gt; or &lt;u&gt;&lt;a href="https://podcasts.apple.com/us/podcast/how-stripe-is-building-for-an-agent-native-world/id1719789201?i=1000764518115" rel="noopener noreferrer" target="_blank"&gt;Apple Podcasts&lt;/a&gt;&lt;/u&gt;, or watch on &lt;u&gt;&lt;a href="https://x.com/danshipper/status/2049512129846530086" rel="noopener noreferrer" target="_blank"&gt;X&lt;/a&gt;&lt;/u&gt; or &lt;u&gt;&lt;a href="https://www.youtube.com/watch?v=-gOyup6yLBY" rel="noopener noreferrer" target="_blank"&gt;YouTube&lt;/a&gt;&lt;/u&gt;.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/context-window/who-isnt-using-gpt-55" rel="noopener noreferrer" target="_blank"&gt;“Who Isn’t Using GPT 5.5”&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; &lt;em&gt;by &lt;a href="https://every.to/@laura_27bbaf_1" rel="noopener noreferrer" target="_blank"&gt;Laura Entis&lt;/a&gt;/Context Window:&lt;/em&gt; One week after GPT-5.5’s release, the Every team checks in: Kieran is now splitting his time evenly between Codex and Claude Code, but &lt;strong&gt;&lt;u&gt;&lt;a 
href="https://every.to/@natalia_2944" rel="noopener noreferrer" target="_blank"&gt;Natalia Quintero&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; ran a head-to-head proposal test and her Claude agent won. &lt;strong&gt;Plus:&lt;/strong&gt; why six unicorn CTOs have stepped down to become Anthropic ICs; how Kieran hit 24 pull requests in a single day by having agents watch user complaint videos overnight; and &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@williewilliams" rel="noopener noreferrer" target="_blank"&gt;Willie Williams&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; on why AI has turned coding into a slot machine—and how to know when to walk away.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;Log on&lt;/h2&gt;&lt;p&gt;We host &lt;u&gt;&lt;a href="https://every.to/events" rel="noopener noreferrer" target="_blank"&gt;camps and workshops&lt;/a&gt;&lt;/u&gt; on topics like &lt;u&gt;&lt;a href="https://www.youtube.com/watch?v=7YUBxMTF1Tc&amp;amp;time_continue=3&amp;amp;source_ve_path=NzY3NTg&amp;amp;embeds_referring_euri=https%3A%2F%2Fevery.to%2F" rel="noopener noreferrer" target="_blank"&gt;compound engineering&lt;/a&gt;&lt;/u&gt; and &lt;u&gt;&lt;a href="https://www.youtube.com/watch?v=oEvjbPwGwnc&amp;amp;source_ve_path=OTY3MTQ&amp;amp;embeds_referring_euri=https%3A%2F%2Fevery.to%2F" rel="noopener noreferrer" target="_blank"&gt;writing with AI&lt;/a&gt;&lt;/u&gt; to share what we’ve learned from training teams at companies like the &lt;em&gt;&lt;u&gt;&lt;a href="https://every.to/on-every/the-next-chapter-of-every-consulting" rel="noopener noreferrer" target="_blank"&gt;New York Times&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;u&gt;&lt;a href="https://every.to/on-every/the-next-chapter-of-every-consulting" rel="noopener noreferrer" target="_blank"&gt; and leading hedge funds&lt;/a&gt;&lt;/u&gt;, and by using and experimenting with AI every day ourselves.&lt;/p&gt;&lt;h5&gt;&lt;strong&gt;Last week’s 
camp&lt;/strong&gt;&lt;/h5&gt;&lt;ul&gt;&lt;li&gt;&lt;strong&gt;&lt;a href="https://every.to/events/codex-for-knowledge-work" rel="noopener noreferrer" target="_blank"&gt;Codex for Knowledge Work Camp&lt;/a&gt;&lt;/strong&gt;: Dan and Austin showed how to use OpenAI’s Codex for drafting, research, summarizing, running tasks in parallel, and building small tools to automate routine knowledge work. &lt;u&gt;&lt;a href="https://every.to/events/codex-for-knowledge-work" rel="noopener noreferrer" target="_blank"&gt;Watch the recording&lt;/a&gt;&lt;/u&gt;.&lt;/li&gt;&lt;/ul&gt;&lt;h5&gt;&lt;strong&gt;Recordings you may have missed&lt;/strong&gt;&lt;/h5&gt;&lt;ul&gt;&lt;li&gt;&lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/" rel="noopener noreferrer" target="_blank"&gt;Compound Engineering Camp&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;: &lt;strong&gt;&lt;u&gt;&lt;a href="https://cora.computer/" rel="noopener noreferrer" target="_blank"&gt;Cora&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; general manager Kieran Klaassen and product leader &lt;strong&gt;Trevin Chow&lt;/strong&gt; walked through what’s new, went deeper on the brainstorm and ideate steps, and shared examples of using the compound engineering plugin in product-focused workflows. &lt;u&gt;&lt;a href="https://www.youtube.com/watch?v=lfML5OJc-CM" rel="noopener noreferrer" target="_blank"&gt;Watch the recording&lt;/a&gt;&lt;/u&gt;.&lt;/li&gt;&lt;/ul&gt;&lt;h2&gt;&lt;hr class="quill-line"&gt;&lt;/h2&gt;&lt;h2&gt;From Every Studio&lt;/h2&gt;&lt;h5&gt;&lt;strong&gt;Spiral lets you browse and restore old draft versions&lt;/strong&gt;&lt;/h5&gt;&lt;p&gt;Spiral added version history—you can now see how a draft evolved and roll back to an earlier version with one click. It also shipped two lightweight API endpoints for quick rewrites and made the onboarding flow noticeably smoother. 
&lt;/p&gt;&lt;h5&gt;&lt;strong&gt;Cora’s inbox has stars, voice dictation, and a smoother compose box&lt;/strong&gt;&lt;/h5&gt;&lt;p&gt;Cora’s inbox got a round of usability upgrades: a starred view for important threads, typed snooze durations, voice dictation, and a smoother compose experience. The app is also faster behind the scenes. Kieran is looking for a small group of alpha testers to help pressure-test the full inbox—if you’re interested, reach out to him at &lt;u&gt;&lt;a href="mailto:kieran@every.to" rel="noopener noreferrer" target="_blank"&gt;kieran@every.to&lt;/a&gt;&lt;/u&gt;. &lt;/p&gt;&lt;h5&gt;&lt;strong&gt;Monologue syncs Apple Watch recordings across your Apple devices&lt;/strong&gt;&lt;/h5&gt;&lt;p&gt;Audio recorded on Apple Watch with &lt;strong&gt;&lt;u&gt;&lt;a href="https://www.monologue.to/" rel="noopener noreferrer" target="_blank"&gt;Monologue&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; syncs across your other Apple devices. The Mac app also got better at meetings, with auto-stop when a meeting ends, more control over which apps trigger recording, and Webex joining Zoom and Teams as a supported platform.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;Alignment &lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Downstream of speed. &lt;/strong&gt;The Food and Drug Administration &lt;u&gt;&lt;a href="https://www.bloomberg.com/news/articles/2026-04-28/fda-plans-to-speed-up-drug-trials-with-real-time-data-ai" rel="noopener noreferrer" target="_blank"&gt;announced&lt;/a&gt;&lt;/u&gt; this week that trials of two cancer drugs—one from AstraZeneca, one from Amgen—will stream their data to the agency in real time. Did a patient develop a fever? Did liver enzymes rise? Did the tumor shrink? Instead of waiting for clinicians to collect, clean, and submit these signals between phases, the FDA will see them as they happen. 
The agency’s chief AI officer estimates this could cut 20 to 40 percent off the time it takes to get a drug from the lab to the pharmacy shelf.&lt;/p&gt;&lt;p&gt;The downstream effect of a faster approval process is a faster way to find out if a drug does not work. Most of what happens inside a pharmaceutical company’s research and development budget is paying smart people to find out, slowly and expensively, that the molecule is a dud—which the current system is optimized to find out as late as possible. With real-time data, the failure might show up in year one instead of year three, giving patients precious time to be rerouted to something that might work. &lt;/p&gt;&lt;p&gt;Structurally, medicine is starting to behave like software. Silicon Valley says move fast and break things, while healthcare has always said the opposite, for the obvious reason that the thing being broken is a person. I’m starting to believe that AI might be the first tool that lets medicine have it both ways.—&lt;em&gt;&lt;u&gt;&lt;a href="https://x.com/Ashwinreads" rel="noopener noreferrer" target="_blank"&gt;Ashwin Sharma&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/p&gt;&lt;h3&gt;&lt;hr class="quill-line"&gt;&lt;/h3&gt;&lt;p&gt;&lt;em&gt;Correction: This article was updated to reflect that Monologue syncs your audio across Apple devices, but cannot hand over a recording in progress. &lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;That’s all for this week! Be sure to follow Every on X at &lt;u&gt;&lt;a href="https://twitter.com/every" rel="noopener noreferrer" target="_blank"&gt;@every&lt;/a&gt;&lt;/u&gt; and on &lt;u&gt;&lt;a href="https://www.linkedin.com/company/everyinc/" rel="noopener noreferrer" target="_blank"&gt;LinkedIn&lt;/a&gt;&lt;/u&gt;.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;We &lt;u&gt;&lt;a href="https://every.to/studio" rel="noopener noreferrer" target="_blank"&gt;build AI tools&lt;/a&gt;&lt;/u&gt; for readers like you. 
Write brilliantly with &lt;u&gt;&lt;a href="https://writewithspiral.com/" rel="noopener noreferrer" target="_blank"&gt;Spiral&lt;/a&gt;&lt;/u&gt;. Organize files automatically with &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://makeitsparkle.co/?utm_source=everyfooter" rel="noopener noreferrer" target="_blank"&gt;Sparkle&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;. Deliver yourself from email with &lt;u&gt;&lt;a href="https://cora.computer" rel="noopener noreferrer" target="_blank"&gt;Cora&lt;/a&gt;&lt;/u&gt;. Dictate effortlessly with &lt;u&gt;&lt;a href="https://monologue.to" rel="noopener noreferrer" target="_blank"&gt;Monologue&lt;/a&gt;&lt;/u&gt;. Work on documents with AI agents using &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://www.proofeditor.ai/?source=post_button" rel="noopener noreferrer" target="_blank"&gt;Proof&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt;.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;For sponsorship opportunities, reach out to sponsorships@every.to.&lt;/em&gt;&lt;/p&gt;&lt;div class="quill-button" data-source="{&amp;quot;id&amp;quot;:&amp;quot;quill-button-1777667995478&amp;quot;,&amp;quot;text&amp;quot;:&amp;quot;Upgrade to paid&amp;quot;,&amp;quot;url&amp;quot;:&amp;quot;https://every.to/subscribe?source=post_button&amp;quot;}" id="quill-button-1777667995478"&gt;&lt;a href="https://every.to/subscribe?source=post_button"&gt;Upgrade to paid&lt;/a&gt;&lt;/div&gt;</description>
      <author>Every Staff / Context Window</author>
      <pubDate>2026-05-03 00:00:00 -0400</pubDate>
      <guid>https://every.to/context-window/codex-goes-to-work</guid>
      <link>https://every.to/context-window/codex-goes-to-work</link>
    </item>
    <item>
      <title>Claude Code for Product Managers</title>
      <description>&lt;table&gt;&lt;tr&gt;&lt;td&gt;&lt;img alt="Source Code" src="https://d24ovhgu8s7341.cloudfront.net/uploads/publication/logo/99/small_Frame_9121.png" /&gt;&lt;/td&gt;&lt;td&gt;&lt;/td&gt;&lt;td&gt;&lt;table&gt;&lt;tr&gt;&lt;td&gt;by &lt;a href="https://every.to/@marcus_fd8302_1" itemprop="name"&gt;Marcus Moretti&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;in &lt;a href="https://every.to/source-code"&gt;Source Code&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;figure&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/post/cover/4151/full_page_cover_6e1cbb415e282d96-Claude_Code_for_Product_Managers.png"&gt;&lt;figcaption&gt;Midjourney/Every illustration.&lt;/figcaption&gt;&lt;/figure&gt;&lt;p&gt;&lt;em&gt;This piece is an accompaniment to &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://writewithspiral.com/" rel="noopener noreferrer" target="_blank"&gt;Spiral&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt; general manager &lt;/em&gt;&lt;strong&gt;&lt;em&gt;&lt;u&gt;&lt;a href="https://every.to/@marcus_fd8302_1" rel="noopener noreferrer" target="_blank"&gt;Marcus Moretti&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/strong&gt;’&lt;em&gt;s guide for product management using Claude. &lt;u&gt;&lt;a href="https://every.to/guides/ai-product-management-guide" rel="noopener noreferrer" target="_blank"&gt;Read the full guide&lt;/a&gt;&lt;/u&gt; and the essay below to learn how he built a workflow that helps him run a full product as a solo practitioner. 
When you’re ready to get started yourself, &lt;u&gt;&lt;a href="https://github.com/EveryInc/compound-engineering-plugin" rel="noopener noreferrer" target="_blank"&gt;download the plugin&lt;/a&gt;&lt;/u&gt;.—&lt;a href="https://every.to/@kate_1767" rel="noopener noreferrer" target="_blank"&gt;Kate Lee&lt;/a&gt; &lt;/em&gt;&lt;/p&gt;&lt;div class="quill-button quill-editing" data-source="{&amp;quot;id&amp;quot;:&amp;quot;quill-button-1777625634382&amp;quot;,&amp;quot;text&amp;quot;:&amp;quot;Read the AI-native product management guide&amp;quot;,&amp;quot;url&amp;quot;:&amp;quot;https://every.to/guides/ai-product-management-guide?source=post_button&amp;quot;}" id="quill-button-1777625634382"&gt;&lt;a href="https://every.to/guides/ai-product-management-guide?source=post_button"&gt;Read the AI-native product management guide&lt;/a&gt;&lt;/div&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;As the general manager of &lt;strong&gt;&lt;u&gt;&lt;a href="https://writewithspiral.com/" rel="noopener noreferrer" target="_blank"&gt;Spiral&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;, Every’s AI writing partner, I’m a &lt;u&gt;&lt;a href="https://every.to/chain-of-thought/the-two-slice-team" rel="noopener noreferrer" target="_blank"&gt;“two-slice team.”&lt;/a&gt;&lt;/u&gt; I’m responsible for all aspects of a product: the code, customer support, marketing, and product management. I could not do this job without Claude.&lt;/p&gt;&lt;p&gt;Claude Code has eliminated the drudgery of product management. The busywork that used to happen across 10 different apps now happens in a single chat thread. I’ve come to view the work of product management through the lens of this conversation—the conversation is the work.&lt;/p&gt;&lt;p&gt;These days, I experience what’s left of product management work in flow state—thinking through gnarly design problems, looking at interesting data, and talking to customers. 
&lt;strong&gt;Cat Wu&lt;/strong&gt;, Claude Code’s head of product, recently &lt;u&gt;&lt;a href="https://youtu.be/PplmzlgE0kg?si=ysy0wvHkTVEkzYie&amp;amp;t=1092" rel="noopener noreferrer" target="_blank"&gt;said&lt;/a&gt;&lt;/u&gt;, “As code becomes much cheaper to write, the thing that becomes more valuable is deciding what to write.” &lt;/p&gt;&lt;p&gt;I wrote up the main skills that run my product management workflow &lt;u&gt;&lt;a href="https://every.to/guides/ai-product-management-guide" rel="noopener noreferrer" target="_blank"&gt;in a guide&lt;/a&gt;&lt;/u&gt;. Below, I trace how I arrived at those skills and reflect on post-AI product management and software.&lt;/p&gt;&lt;h2&gt;&lt;strong&gt;Write the roadmap and nothing else&lt;/strong&gt;&lt;/h2&gt;&lt;p&gt;In my new role, the only product document I’ve written is the roadmap. Everything else—every PRD and every ticket—has been written by Claude.&lt;/p&gt;&lt;p&gt;Writing is thinking, so as a new general manager, I wanted to take my time drafting Spiral’s roadmap. I spent several days understanding the product, usage trends, user feedback, and the market. I wrote about the problem Spiral can solve, how Spiral can solve it, and the features we’d need to build to deliver on it. I spent hours talking to several people at the company who’d worked on previous versions of Spiral and were current or former users of it themselves. (In the guide, I talk about the new /ce:strategy skill in &lt;u&gt;&lt;a href="https://github.com/EveryInc/compound-engineering-plugin" rel="noopener noreferrer" target="_blank"&gt;compound engineering&lt;/a&gt;&lt;/u&gt; that interviews you to produce this document for your own product.)&lt;/p&gt;&lt;p&gt;After six drafts of the roadmap, I created a GitHub project and added it as the project’s &lt;u&gt;&lt;a href="https://en.wikipedia.org/wiki/README" rel="noopener noreferrer" target="_blank"&gt;README&lt;/a&gt;&lt;/u&gt;. 
I’m already using GitHub to host all my code, so I figured I might as well use it for tickets as well, or as GitHub calls them, “issues.”&lt;/p&gt;&lt;p&gt;From there, I asked Claude to use the GitHub command line interface (CLI) to read the README and give feedback. We went back and forth on a few tweaks, and then I asked it to review the codebase and do a first pass of the tickets required to deliver the roadmap. Within a few minutes, Claude produced about 100 detailed tickets, each with strategic context, supporting data, acceptance criteria, and technical implementation notes.&lt;/p&gt;&lt;p&gt;To be fair, the roadmap I wrote was pretty detailed; Claude wasn’t hallucinating features. And it had access to a library of user feedback and recent usage reports (more on that below). But it was shocking to see something that had previously taken me days or weeks get done by Claude in minutes. It felt like the PM equivalent of vibe coding.&lt;/p&gt;&lt;p&gt;I’d previously prided myself on the absence of ambiguity in the tickets I produced for engineers, but this was next-level. Claude also prioritized the work in an unbiased way. Sometimes, a product manager gets emotionally attached to a certain feature idea for whatever reason. Claude, however, was ruthless in elevating the things that had the best shot at delivering the vision and hitting our 2026 goals.&lt;/p&gt;&lt;p&gt;That doesn’t mean the tickets were all ready to be implemented. When I do pick up a ticket, I do a full review of the requirements before asking Claude to implement it. This is a step where I still add some value. Claude’s first pass gets the feature right in broad strokes, but it struggles with some aspects of data modeling, microinteractions, and edge cases. 
I often adjust specs to reflect the nuances of real usage patterns, while Claude seems to envision a perfectly rational user reminiscent of &lt;u&gt;&lt;a href="https://www.pnas.org/doi/10.1073/pnas.2409646121" rel="noopener noreferrer" target="_blank"&gt;pre-Kahnemanian economics&lt;/a&gt;&lt;/u&gt;.&lt;/p&gt;&lt;p&gt;I don’t do sprints. I have five columns in the GitHub project: later, next, now, in progress, and done. Around once a day, I run a custom command, /prioritize, and Claude does a sweep—checking for stale tickets, confirming that “now” is this week’s work, pulling anything urgent out of the backlog. &lt;/p&gt;&lt;p&gt;If I discover a bug or a user asks for a compelling feature, I tell Claude to create a ticket. It gets a “triage” label and is sorted in the next /prioritize run. If it’s a priority-zero issue, I go straight to fixing it without creating an issue.&lt;/p&gt;&lt;p&gt;Over time, the GitHub project becomes the product’s working memory: a fluid, continuously prioritized picture of where things stand. I’ve claimed to work in an &lt;u&gt;&lt;a href="https://agilemanifesto.org/" rel="noopener noreferrer" target="_blank"&gt;Agile&lt;/a&gt;&lt;/u&gt; fashion before, but in hindsight, I don’t think Agile was really possible until these new AI tools came out.&lt;/p&gt;&lt;div class="quill-button" data-source="{&amp;quot;id&amp;quot;:&amp;quot;quill-button-1777625634382&amp;quot;,&amp;quot;text&amp;quot;:&amp;quot;Read the AI-native product management guide&amp;quot;,&amp;quot;url&amp;quot;:&amp;quot;https://every.to/guides/ai-product-management-guide?source=post_button&amp;quot;}" id="quill-button-1777625634382"&gt;&lt;a href="https://every.to/guides/ai-product-management-guide?source=post_button"&gt;Read the AI-native product management guide&lt;/a&gt;&lt;/div&gt;&lt;h2&gt;&lt;strong&gt;The pulse command&lt;/strong&gt;&lt;/h2&gt;&lt;p&gt;The old way of understanding how customers were using your product was to look at dashboards and run queries. 
You’d open Amplitude or Mixpanel and get an overview: how many users, how often, how long, what features, what revenue. Setting these up took time; sometimes they required engineering work, competing with product updates for developer bandwidth.&lt;/p&gt;&lt;p&gt;These days, I don’t look at dashboards. I run a custom command, /pulse, that delivers something closer to an analyst’s briefing than a chart. The pulse command surfaces a range of metrics, including active users, chats/messages/drafts created, response times for key parts of the system, conversations graded one to five, and an anonymized sampling of use cases. And because Claude is a language model, it doesn’t just pull numbers: It reads the text, grades every conversation, flags anomalies with a green or red dot, and explains what it found in plain English. &lt;/p&gt;&lt;p&gt;The command is just a Markdown file, so the format itself is easy to change. I’ve adjusted it about 50 times since I built it. When a feature ships, I add a line, and the next morning it shows up in the report.&lt;/p&gt;&lt;p&gt;Every pulse report lives inside a Claude thread. When a recent report surfaced a bug driving down conversation scores, my next message in that same thread was to fix it. I did not have to create a ticket; I solved it in the same conversation. Over time, Claude also learns the nuances of the system and saves them to memory.&lt;/p&gt;&lt;h2&gt;&lt;strong&gt;Product research&lt;/strong&gt;&lt;/h2&gt;&lt;p&gt;For all the magic of AI, there is no substitute for talking to users. What people say about your product and how they try to use it is endlessly surprising. Just when I think I’ve shipped the world’s most intuitive feature, a confused user will ask a question from an angle that would never have occurred to me.&lt;/p&gt;&lt;p&gt;That said, there are elements of product research that Claude seriously elevates. 
Here’s one example: A big part of Spiral’s value proposition is reflecting the user’s writing style in the drafts it generates. There’s a rich academic literature on stylometry, &lt;u&gt;&lt;a href="https://every.to/p/the-science-of-why-ai-still-can-t-write-like-you" rel="noopener noreferrer" target="_blank"&gt;the study of style&lt;/a&gt;&lt;/u&gt;. &lt;/p&gt;&lt;p&gt;I leaned on Claude to help me wade through the literature for findings relevant to Spiral’s “style transfer” approach. Using the &lt;u&gt;&lt;a href="https://github.com/blazickjp/arxiv-mcp-server" rel="noopener noreferrer" target="_blank"&gt;arXiv model context protocol (MCP) server&lt;/a&gt;&lt;/u&gt;, Claude found a dozen recent papers about LLM stylometry. I read their abstracts, then read a handful in full. I cited those papers in the &lt;u&gt;&lt;a href="https://every.to/p/the-science-of-why-ai-still-can-t-write-like-you" rel="noopener noreferrer" target="_blank"&gt;article I wrote for Every&lt;/a&gt;&lt;/u&gt;, and they’ve been directly informing the new style system I’m building in Spiral. It’s so cool to see academic citations sprinkled across product requirements. For product work where you have a real opportunity to differentiate, it’s worth going the extra mile on research, which is now within reach.&lt;/p&gt;&lt;h2&gt;&lt;strong&gt;What SaaS survives &lt;/strong&gt;&lt;/h2&gt;&lt;p&gt;AI should open up product management to more people—you don’t need formal PM training when the tool itself can teach you. If you don’t know what metrics to pick for your pulse equivalent, ask Claude for recommendations. If you’ve never analyzed an A/B test, ask Claude how. If you’re not sure whether a feature will move the needle, ask Claude to predict its impact. 
To paraphrase &lt;u&gt;&lt;a href="https://blogs.nvidia.com/blog/davos-wef-blackrock-ceo-larry-fink-jensen-huang/" rel="noopener noreferrer" target="_blank"&gt;Nvidia CEO &lt;/a&gt;&lt;/u&gt;&lt;strong&gt;&lt;u&gt;&lt;a href="https://blogs.nvidia.com/blog/davos-wef-blackrock-ceo-larry-fink-jensen-huang/" rel="noopener noreferrer" target="_blank"&gt;Jensen Huang&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;, AI is the easiest product in history to use, because if you don’t know how to use AI, you can just ask the AI.&lt;/p&gt;&lt;p&gt;I’ve cancelled several B2B subscriptions since moving my product management work into Claude, which means I’m seeing the &lt;u&gt;&lt;a href="https://techcrunch.com/2026/03/01/saas-in-saas-out-heres-whats-driving-the-saaspocalypse/" rel="noopener noreferrer" target="_blank"&gt;SaaSpocalypse&lt;/a&gt;&lt;/u&gt; play out in my own spending decisions. Yet I’m building a SaaS product. How do I make sure Spiral doesn’t get steamrolled by the frontier model providers?&lt;/p&gt;&lt;p&gt;I believe it’s possible for a SaaS product to survive if it has two main characteristics: &lt;/p&gt;&lt;ol&gt;&lt;li&gt;&lt;strong&gt;Unique sources of critical data:&lt;/strong&gt; my database, my analytics, my payment system—services that would be very difficult to rip out.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Seamless agent integrations:&lt;/strong&gt; GitHub, Stripe, PostHog, and Logfire have played nicely with Claude. One service I inherited from my predecessor didn’t have an MCP server, and it was swiftly cancelled.&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;For Spiral, if we nail style transfer—something heavily post-trained language models inherently struggle with—Spiral becomes the unique source of your written voice in an agentic world. That’s valuable and sticky. Already, API chats outnumber web chats, a milestone that we reached three days after launching the agent that handles Spiral’s API requests. 
That means users aren’t necessarily using Spiral in the Spiral app; they’re using it across their workflows. &lt;/p&gt;&lt;p&gt;Good product management is making something people want, to quote &lt;u&gt;&lt;a href="https://www.paulgraham.com/good.html" rel="noopener noreferrer" target="_blank"&gt;Y Combinator&lt;/a&gt;&lt;/u&gt;. Great products come from inspiration and ingenuity, things that tools and processes—no matter how good—won’t bring you. Perhaps the best thing about this new agent toolset is that it gets rid of the busywork that saps creative energy. There’s more space now for daydreaming and far-fetched ideas. Product management can now be fun.&lt;/p&gt;&lt;div class="quill-button" data-source="{&amp;quot;id&amp;quot;:&amp;quot;quill-button-1777625634382&amp;quot;,&amp;quot;text&amp;quot;:&amp;quot;Read the AI-native product management guide&amp;quot;,&amp;quot;url&amp;quot;:&amp;quot;https://every.to/guides/ai-product-management-guide?source=post_button&amp;quot;}" id="quill-button-1777625634382"&gt;&lt;a href="https://every.to/guides/ai-product-management-guide?source=post_button"&gt;Read the AI-native product management guide&lt;/a&gt;&lt;/div&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;em&gt;&lt;a href="https://every.to/@marcus_fd8302_1" rel="noopener noreferrer" target="_blank"&gt;Marcus Moretti&lt;/a&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt; is the general manager of &lt;a href="https://writewithspiral.com/" rel="noopener noreferrer" target="_blank"&gt;Spiral&lt;/a&gt; (&lt;u&gt;&lt;a href="https://x.com/tryspiral" rel="noopener noreferrer" target="_blank"&gt;@tryspiral&lt;/a&gt;&lt;/u&gt;). 
To read more essays like this, subscribe to &lt;u&gt;&lt;a href="https://every.to/subscribe" rel="noopener noreferrer" target="_blank"&gt;Every&lt;/a&gt;&lt;/u&gt;, and follow us on X at &lt;u&gt;&lt;a href="http://twitter.com/every" rel="noopener noreferrer" target="_blank"&gt;@every&lt;/a&gt;&lt;/u&gt; and on &lt;u&gt;&lt;a href="https://www.linkedin.com/company/everyinc/" rel="noopener noreferrer" target="_blank"&gt;LinkedIn&lt;/a&gt;&lt;/u&gt;.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;For sponsorship opportunities, reach out to sponsorships@every.to.&lt;/em&gt;&lt;/p&gt;&lt;div class="quill-button" data-source="{&amp;quot;id&amp;quot;:&amp;quot;quill-button-1769187301610&amp;quot;,&amp;quot;url&amp;quot;:&amp;quot;https://every.to/subscribe?source=post_button&amp;quot;,&amp;quot;text&amp;quot;:&amp;quot;Subscribe&amp;quot;}" id="quill-button-1769187301610"&gt;&lt;a href="https://every.to/subscribe?source=post_button"&gt;Subscribe&lt;/a&gt;&lt;/div&gt;</description>
      <author>Marcus Moretti / Source Code</author>
      <pubDate>2026-05-01 15:00:00 -0400</pubDate>
      <guid>https://every.to/source-code/claude-code-for-product-managers</guid>
      <link>https://every.to/source-code/claude-code-for-product-managers</link>
    </item>
    <item>
      <title>Who Isn't Using GPT 5.5</title>
      <description>&lt;table&gt;&lt;tr&gt;&lt;td&gt;&lt;img alt="Context Window" src="https://d24ovhgu8s7341.cloudfront.net/uploads/publication/logo/94/small_context_windown_1.png" /&gt;&lt;/td&gt;&lt;td&gt;&lt;/td&gt;&lt;td&gt;&lt;table&gt;&lt;tr&gt;&lt;td&gt;by &lt;a href="https://every.to/@laura_27bbaf_1" itemprop="name"&gt;Laura Entis&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;in &lt;a href="https://every.to/context-window"&gt;Context Window&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;figure&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/post/cover/4150/full_page_cover_CW_Thursday.png"&gt;&lt;figcaption&gt;Midjourney/Every illustration.&lt;/figcaption&gt;&lt;/figure&gt;&lt;p&gt;It’s been one week since OpenAI’s last big release, &lt;u&gt;&lt;a href="https://every.to/vibe-check/gpt-5-5" rel="noopener noreferrer" target="_blank"&gt;GPT 5.5&lt;/a&gt;&lt;/u&gt;. Today, we ask the team if they still feel as enthusiastic about the model, discuss the unusual career step that unicorn CTOs are making, and tell you exactly how &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@kieran_1355" rel="noopener noreferrer" target="_blank"&gt;Kieran Klaassen&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;, creator of the AI-native &lt;u&gt;&lt;a href="https://every.to/source-code/compound-engineering-the-definitive-guide" rel="noopener noreferrer" target="_blank"&gt;compound engineering methodology&lt;/a&gt;&lt;/u&gt;, hit a personal PR record in a day.—&lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@laura_27bbaf_1" rel="noopener noreferrer" target="_blank"&gt;Laura Entis&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; &lt;/p&gt;&lt;div class="quill-button" data-source="{&amp;quot;id&amp;quot;:&amp;quot;quill-button-1769530239147&amp;quot;,&amp;quot;url&amp;quot;:&amp;quot;https://every.to/subscribe?source=post_button&amp;quot;,&amp;quot;text&amp;quot;:&amp;quot;Subscribe&amp;quot;}" id="quill-button-1769530239147"&gt;&lt;a 
href="https://every.to/subscribe?source=post_button"&gt;Subscribe&lt;/a&gt;&lt;/div&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;Signal&lt;/h2&gt;&lt;h4&gt;The unicorn CTO-to-Anthropic IC pipeline&lt;/h4&gt;&lt;p&gt;The prestige career ladder in tech used to run one way: Start as an engineer, become a manager, and eventually join the C-suite. AI has scrambled the equation. The new flex is quitting a high-profile chief technology officer job to become an individual contributor at Anthropic.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;What happened:&lt;/strong&gt; Six former CTOs at companies valued north of $1 billion—including &lt;u&gt;&lt;a href="https://every.to/context-window/instagram-s-cofounder-on-why-great-products-are-still-hard-to-build" rel="noopener noreferrer" target="_blank"&gt;Instagram&lt;/a&gt;&lt;/u&gt;, Workday, and Box—have made that &lt;u&gt;&lt;a href="https://x.com/henrythe9ths/status/2049148130059292743" rel="noopener noreferrer" target="_blank"&gt;exact career move&lt;/a&gt;&lt;/u&gt;, according to one of those CTOs on X. And the leadership-back-to-IC trajectory isn’t unique to Anthropic: PostHog is recruiting &lt;u&gt;&lt;a href="https://posthog.com/careers/technical-ex-founder" rel="noopener noreferrer" target="_blank"&gt;technical ex-founders&lt;/a&gt;&lt;/u&gt;, and Ramp says it has attracted &lt;u&gt;&lt;a href="https://ramp.com/leading-indicators/the-art-of-hiring-insights" rel="noopener noreferrer" target="_blank"&gt;70 ex-founders&lt;/a&gt;&lt;/u&gt; by looking for “super ICs.”&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Why it matters:&lt;/strong&gt; AI has upended engineering workflows so dramatically that many managers who don’t ship code frequently anymore don’t have a clear sense of how their teams are using these new tools or which ways of working are the best. 
Anthropic’s models, talent, and growth trajectory make it one of the few places big-name CTOs can get their hands dirty and experience how engineering is changing—while not worrying too much about a pay cut.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;Pulse check&lt;/h2&gt;&lt;h4&gt;We settle in with GPT-5.5&lt;/h4&gt;&lt;p&gt;GPT-5.5 came out last week, and our first impression was that it was a &lt;u&gt;&lt;a href="https://every.to/vibe-check/gpt-5-5" rel="noopener noreferrer" target="_blank"&gt;faster, steadier, and easier-to-trust model&lt;/a&gt;&lt;/u&gt; for everyday professional work than &lt;u&gt;&lt;a href="https://every.to/vibe-check/opus-4-7" rel="noopener noreferrer" target="_blank"&gt;Opus 4.7&lt;/a&gt;&lt;/u&gt;. A week later, we’re still bullish on GPT-5.5—but for people with Claude-specific agent workflows, skills, and tool integrations, making the switch to Codex is a barrier.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;u&gt;&lt;a href="https://cora.computer/" rel="noopener noreferrer" target="_blank"&gt;Cora&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; general manager &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@kieran_1355" rel="noopener noreferrer" target="_blank"&gt;Kieran Klaassen&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;, who initially didn’t think he’d use GPT-5.5 as a daily driver, has changed his mind. What won him over? GPT-5.5’s speed and “workhorse” ability to follow clear directions. GPT-5.5 isn’t perfect—it’s worse at multitasking and planning than Opus 4.7—but his work is now evenly split between Codex and Claude Code.&lt;/p&gt;&lt;p&gt;Every head of growth &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@tedescau" rel="noopener noreferrer" target="_blank"&gt;Austin Tedesco&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; thinks GPT-5.5 is enough of a step change that he’s been telling friends to make the switch from Claude Code to Codex. They mostly don’t want to hear it. 
Austin says the response has been, “That feels like a lot of work. Do I really have to? Is it that much better?” &lt;/p&gt;&lt;p&gt;&lt;u&gt;&lt;a href="https://every.to/consulting" rel="noopener noreferrer" target="_blank"&gt;Every’s consulting team&lt;/a&gt;&lt;/u&gt; is wrestling with the same dilemma. They have a good thing going with their Claude agent, &lt;u&gt;&lt;a href="https://every.to/p/what-i-learned-onboarding-our-ai-project-manager" rel="noopener noreferrer" target="_blank"&gt;Claudie&lt;/a&gt;&lt;/u&gt;, and migrating to GPT-5.5 in Codex requires time and testing. Head of consulting &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@natalia_2944" rel="noopener noreferrer" target="_blank"&gt;Natalia Quintero&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; had GPT-5.5 and Claudie draft head-to-head sales proposals; Claudie’s won handily. Getting the most out of GPT-5.5 will likely require the team to optimize its Claude plugins for Codex.&lt;/p&gt;&lt;p&gt;Every head of tech consulting &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@mike_2114" rel="noopener noreferrer" target="_blank"&gt;Mike Taylor&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; doesn’t have the time to do that right now. He has gripes with &lt;u&gt;&lt;a href="https://every.to/vibe-check/opus-4-7" rel="noopener noreferrer" target="_blank"&gt;Opus&lt;/a&gt;&lt;/u&gt;—it recently messed up some PowerPoints—but, “I already have my Claude set up the way I like it, and there are some things that are different about Codex,” he says. When work dies down a little, he’ll experiment, but until then, he’s sticking with the devil he knows.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;Data point&lt;/h2&gt;&lt;h4&gt;24&lt;/h4&gt;&lt;p&gt;That’s the number of pull requests Kieran merged in a single day last week, a number he thinks is a personal record. 
A month ago, he’d average two or three.&lt;/p&gt;&lt;p&gt;Kieran hit that pace because he’s automated most of the implementation process. His workflow:&lt;/p&gt;&lt;ol&gt;&lt;li&gt;Upload screen recordings of people using and reviewing Cora into Codex.&lt;/li&gt;&lt;li&gt;Have his agents watch the recordings, identify product fixes, and open pull requests against Cora’s repository overnight.&lt;/li&gt;&lt;li&gt;Review the pull requests when he wakes up.&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;Initially, he worried he’d have to clean up agent-generated gobbledygook. Not the case. “So far, everything works great, and nothing breaks,” he says. “It feels like cheating.”&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;Jagged frontier&lt;/h2&gt;&lt;h4&gt;We’re all one prompt away from perfection&lt;/h4&gt;&lt;p&gt;We’ve spent years talking about the addictiveness of social media algorithms, dopamine drips expertly designed to keep us scrolling. Engineers, being engineers, like to believe we’re above this, or at least better attuned to the mechanism behind our compulsion. But now it has come for us too: LLMs have become the social media feed for people who make things.&lt;/p&gt;&lt;p&gt;Coding feels like playing the slots.&lt;/p&gt;&lt;p&gt;It used to be that you could code something exactly to your specifications, but that required time, hard-won expertise, and design skills if you wanted to make it look halfway decent. Now, I can throw an idea at Claude Code and get something close. I spend my days toggling between sessions, waiting to hit the jackpot and receive the perfect version of whatever I’m looking for—the perfect API design, the perfect bug fix. I tweak my prompt and pull the lever again. And again. And again until it’s somehow 3 a.m.&lt;/p&gt;&lt;p&gt;It’s that sense of being almost there—but not quite—that’s so intoxicating. 
&lt;/p&gt;&lt;p&gt;I ask &lt;u&gt;&lt;a href="https://every.to/podcast/how-openai-s-codex-team-uses-their-coding-agent" rel="noopener noreferrer" target="_blank"&gt;Codex&lt;/a&gt;&lt;/u&gt; for five ways to structure a new feature and decide that I like option three, but want to keep the data model from option two. In its next turn—the next roll of the dice—it might magically marry the two to create the result needed. Or I might need to roll again. Each pull has the potential to patch the bug, or perfect the copy, or reveal a better plan. It feels like productivity and gambling got wired together, each turn a workspace lotto ticket.&lt;/p&gt;&lt;p&gt;This is not only a coding problem. Writers feel it when they ask for one more way to structure an article or sharpen a sentence or &lt;u&gt;&lt;a href="https://every.to/working-overtime/writing-with-ai-is-harder-than-you-think" rel="noopener noreferrer" target="_blank"&gt;revise a draft&lt;/a&gt;&lt;/u&gt;. Product managers feel it when they ask for one more onboarding flow, roadmap, or way to sequence a launch. We are all always one prompt away from perfection.&lt;/p&gt;&lt;p&gt;I do not have infinite hours. So at some point, I have to choose a path and stick with it, even though there are better ones. I accept that if the main shape of the solution is right, the edges can stay a little fuzzy.&lt;/p&gt;&lt;p&gt;The most important skill isn’t choosing the right model or prompt engineering. 
It’s knowing when to take your winnings and move on.—&lt;em&gt;&lt;u&gt;&lt;a href="https://every.to/@williewilliams" rel="noopener noreferrer" target="_blank"&gt;Willie Williams&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;One last thing&lt;/h2&gt;&lt;h4&gt;Behind OpenAI’s goblin ban&lt;/h4&gt;&lt;p&gt;Starting a few releases back, OpenAI models developed an affinity for including references to creatures (sometimes visually, but mostly textually) in their outputs—raccoons, trolls, ogres, pigeons, but &lt;a href="https://openai.com/index/where-the-goblins-came-from/" rel="noopener noreferrer" target="_blank"&gt;most of all, goblins and gremlins&lt;/a&gt;. “The goblins were funny at first, but the increasing number of employee reports became concerning,” &lt;u&gt;&lt;a href="https://openai.com/index/where-the-goblins-came-from/" rel="noopener noreferrer" target="_blank"&gt;the company said yesterday&lt;/a&gt;&lt;/u&gt;.&lt;/p&gt;&lt;p&gt;When OpenAI tested &lt;u&gt;&lt;a href="https://every.to/vibe-check/gpt-5-5" rel="noopener noreferrer" target="_blank"&gt;GPT-5.5&lt;/a&gt;&lt;/u&gt; in Codex, there were so many goblin references that it added &lt;u&gt;&lt;a href="https://github.com/openai/codex/blob/main/codex-rs/models-manager/models.json#L55" rel="noopener noreferrer" target="_blank"&gt;developer-prompt instructions&lt;/a&gt;&lt;/u&gt; &lt;a href="https://x.com/arb8020/status/2048958391637401718" rel="noopener noreferrer" target="_blank"&gt;forbidding creature-based chat&lt;/a&gt; unless “it is absolutely and unambiguously relevant to the user’s query.”&lt;/p&gt;&lt;p&gt;The culprit: A specific personality setting rewarded responses that included goblin- and gremlin-based metaphors, a learning that spread to influence the training data for the entire model—including GPT-5.5. 
&lt;/p&gt;&lt;p&gt;If you want to welcome creatures back into the conversation, OpenAI shared the following command to unlock Codex Gringotts mode.  &lt;/p&gt;&lt;div class="quill-code-snippet code-snippet" id="quill-code-snippet-1777562689301" data-code-snippet="" data-source="{&amp;quot;id&amp;quot;:&amp;quot;quill-code-snippet-1777562689301&amp;quot;,&amp;quot;title&amp;quot;:&amp;quot;Code snippet&amp;quot;,&amp;quot;language&amp;quot;:&amp;quot;bash&amp;quot;,&amp;quot;code&amp;quot;:&amp;quot;instructions=$(mktemp /tmp/gpt-5.5-instructions.XXXXXX) &amp;amp;&amp;amp; \\\njq -r ‘.models[] | select(.slug==“gpt-5.5”) | .base_instructions’ \\\n~/.codex/models_cache.json | \\\ngrep -vi ‘goblins’ &amp;gt; “$instructions” &amp;amp;&amp;amp; \\\ncodex -m gpt-5.5 -c “model_instructions_file=\\”$instructions\\“”&amp;quot;,&amp;quot;show_claude&amp;quot;:false,&amp;quot;show_chatgpt&amp;quot;:false,&amp;quot;show_gemini&amp;quot;:false,&amp;quot;show_copy&amp;quot;:true}"&gt;
      &lt;div class="code-snippet-header"&gt;
        &lt;div class="code-snippet-header-left"&gt;
          &lt;span class="code-snippet-title"&gt;Code snippet&lt;/span&gt;
          &lt;span class="code-snippet-lang-badge"&gt;Bash / Shell&lt;/span&gt;
        &lt;/div&gt;
        &lt;div class="code-snippet-actions"&gt;&lt;button class="code-snippet-btn" aria-label="Copy code" data-tip="Copy code" data-copy-code=""&gt;&lt;svg width="16" height="16" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"&gt;&lt;rect x="9" y="9" width="13" height="13" rx="2" ry="2"&gt;&lt;/rect&gt;&lt;path d="M5 15H4a2 2 0 0 1-2-2V4a2 2 0 0 1 2-2h9a2 2 0 0 1 2 2v1"&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/button&gt;&lt;/div&gt;
      &lt;/div&gt;
      &lt;div class="code-snippet-body"&gt;
        &lt;div class="code-snippet-gutter" aria-hidden="true"&gt;&lt;span class="code-snippet-line-num"&gt;1&lt;/span&gt;&lt;span class="code-snippet-line-num"&gt;2&lt;/span&gt;&lt;span class="code-snippet-line-num"&gt;3&lt;/span&gt;&lt;span class="code-snippet-line-num"&gt;4&lt;/span&gt;&lt;span class="code-snippet-line-num"&gt;5&lt;/span&gt;&lt;/div&gt;
        &lt;pre class="code-snippet-code" data-code-text=""&gt;instructions=$(mktemp /tmp/gpt-&lt;span class="cs-number"&gt;5.5&lt;/span&gt;-instructions.XXXXXX) &amp;amp;&amp;amp; \
jq -r '.models[] | select(.slug=="gpt-&lt;span class="cs-number"&gt;5.5&lt;/span&gt;") | .base_instructions' \
~/.codex/models_cache.json | \
grep -vi 'goblins' &amp;gt; "$instructions" &amp;amp;&amp;amp; \
codex -m gpt-&lt;span class="cs-number"&gt;5.5&lt;/span&gt; -c "model_instructions_file=\"$instructions\""&lt;/pre&gt;
      &lt;/div&gt;
    &lt;/div&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;em&gt;&lt;a href="https://every.to/@laura_27bbaf_1" rel="noopener noreferrer" target="_blank"&gt;Laura Entis&lt;/a&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt; is a staff writer at Every. You can follow her on &lt;a href="https://www.linkedin.com/in/lauraentis/" rel="noopener noreferrer" target="_blank"&gt;LinkedIn&lt;/a&gt;. To read more essays like this, subscribe to &lt;u&gt;&lt;a href="https://every.to/subscribe" rel="noopener noreferrer" target="_blank"&gt;Every&lt;/a&gt;&lt;/u&gt;, and follow us on X at &lt;u&gt;&lt;a href="http://twitter.com/every" rel="noopener noreferrer" target="_blank"&gt;@every&lt;/a&gt;&lt;/u&gt; and on &lt;u&gt;&lt;a href="https://www.linkedin.com/company/everyinc/" rel="noopener noreferrer" target="_blank"&gt;LinkedIn&lt;/a&gt;&lt;/u&gt;.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;For sponsorship opportunities, reach out to sponsorships@every.to.&lt;/em&gt;&lt;/p&gt;&lt;div class="quill-button" data-source="{&amp;quot;id&amp;quot;:&amp;quot;quill-button-1769187301610&amp;quot;,&amp;quot;url&amp;quot;:&amp;quot;https://every.to/subscribe?source=post_button&amp;quot;,&amp;quot;text&amp;quot;:&amp;quot;Subscribe&amp;quot;}" id="quill-button-1769187301610"&gt;&lt;a href="https://every.to/subscribe?source=post_button"&gt;Subscribe&lt;/a&gt;&lt;/div&gt;</description>
      <author>Laura Entis / Context Window</author>
      <pubDate>2026-04-30 03:00:00 -0400</pubDate>
      <guid>https://every.to/context-window/who-isnt-using-gpt-55</guid>
      <link>https://every.to/context-window/who-isnt-using-gpt-55</link>
    </item>
    <item>
      <title>Compute Is the New Cash</title>
      <description>&lt;table&gt;&lt;tr&gt;&lt;td&gt;&lt;img alt="Context Window" src="https://d24ovhgu8s7341.cloudfront.net/uploads/publication/logo/94/small_context_windown_1.png" /&gt;&lt;/td&gt;&lt;td&gt;&lt;/td&gt;&lt;td&gt;&lt;table&gt;&lt;tr&gt;&lt;td&gt;by &lt;a href="https://every.to/@laura_27bbaf_1" itemprop="name"&gt;Laura Entis&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;in &lt;a href="https://every.to/context-window"&gt;Context Window&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;&lt;figure&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/post/cover/4149/full_page_cover_cover_image_concept.png"&gt;&lt;figcaption&gt;Midjourney/Every illustration.&lt;/figcaption&gt;&lt;/figure&gt;&lt;h2&gt;‘AI &amp;amp; I’: How Stripe is building for an agent-native world&lt;/h2&gt;&lt;p&gt;A new episode of &lt;em&gt;&lt;u&gt;&lt;a href="https://every.to/podcast" rel="noopener noreferrer" target="_blank"&gt;AI &amp;amp; I&lt;/a&gt;&lt;/u&gt;&lt;/em&gt; is here. &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@danshipper" rel="noopener noreferrer" target="_blank"&gt;Dan Shipper&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; sits down with &lt;strong&gt;Emily Glassberg Sands&lt;/strong&gt;, head of data and AI at Stripe, to discuss how AI is reshaping online commerce. 
Dan and Emily discuss how compute is the new cash, fraud has moved beyond the checkout, and agents are starting to act as economic participants on the internet.&lt;/p&gt;&lt;p&gt;Watch on &lt;a href="https://x.com/danshipper/status/2049512129846530086" rel="noopener noreferrer" target="_blank"&gt;X&lt;/a&gt; or &lt;a href="https://www.youtube.com/watch?v=-gOyup6yLBY" rel="noopener noreferrer" target="_blank"&gt;YouTube&lt;/a&gt;, or listen on &lt;a href="https://open.spotify.com/episode/1pR0DddFi6645oTlOX9uq9?si=5jU2B7j6RgOvLretK1fHjg" rel="noopener noreferrer" target="_blank"&gt;Spotify&lt;/a&gt; or &lt;a href="https://podcasts.apple.com/us/podcast/how-stripe-is-building-for-an-agent-native-world/id1719789201?i=1000764518115" rel="noopener noreferrer" target="_blank"&gt;Apple Podcasts&lt;/a&gt;. You can also read the &lt;u&gt;&lt;a href="https://every.to/podcast/transcript-a-look-inside-the-agent-economy" rel="noopener noreferrer" target="_blank"&gt;transcript&lt;/a&gt;&lt;/u&gt;.&lt;/p&gt;&lt;p&gt;Here are the highlights:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;strong&gt;The definition of fraud is expanding:&lt;/strong&gt; Fraud used to be about payments and stolen credit cards. Now AI companies also have to defend against attackers stealing tokens from free trials, credits, and unpaid compute bills. “Fraud is now a full-funnel problem, not a transaction problem alone,” says Glassberg Sands.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;AI is making fraud easier to execute and detect:&lt;/strong&gt; Fraudsters now have AI on their side, but so do the companies trying to stop them. AI services also have higher marginal costs than traditional SaaS, so stolen compute can be burned through quickly or resold.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;The internet needs to evolve:&lt;/strong&gt; Stripe was built for an internet where people browsed, filled out forms, and clicked checkout buttons. 
Now, humans act through AI interfaces, agents act for them, and software increasingly interacts directly with other software. Every layer of the stack has to adapt to these new behaviors.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;AI growth is still mostly new money:&lt;/strong&gt; The top AI companies on Stripe are reaching $30 million in annual recurring revenue &lt;u&gt;&lt;a href="https://stripe.com/guides/indexing-the-ai-economy" rel="noopener noreferrer" target="_blank"&gt;in about 18 months&lt;/a&gt;&lt;/u&gt;—roughly three times faster than top SaaS companies from 2018. For now, that growth is largely net new spend rather than cannibalized software budgets, says Glassberg Sands.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Agents are snapping up commodities:&lt;/strong&gt; Agentic commerce is real but still in its early stages, and focused on smaller purchases. People are more comfortable letting agents buy low-stakes, easily comparable items like Halloween costumes or school supplies than letting them book a summer trip or order an expensive couch.&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;Miss an episode? 
Catch up on Dan’s recent conversations with LinkedIn cofounder &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/" rel="noopener noreferrer" target="_blank"&gt;Reid Hoffman&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;; the team that built Claude Code, &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/" rel="noopener noreferrer" target="_blank"&gt;Cat Wu&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; and &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/" rel="noopener noreferrer" target="_blank"&gt;Boris Cherny&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;; Vercel cofounder &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/" rel="noopener noreferrer" target="_blank"&gt;Guillermo Rauch&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;; podcaster &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/" rel="noopener noreferrer" target="_blank"&gt;Dwarkesh Patel&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;; and others, and learn how they use AI to think, create, and relate.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;Signal&lt;/h2&gt;&lt;h5&gt;The fees they are a-changin’&lt;/h5&gt;&lt;p&gt;Recent years saw the end of the &lt;u&gt;&lt;a href="https://www.nytimes.com/2021/06/08/technology/farewell-millennial-lifestyle-subsidy.html" rel="noopener noreferrer" target="_blank"&gt;millennial lifestyle subsidy&lt;/a&gt;&lt;/u&gt;, which let a generation live off of inordinately cheap Ubers, delivery services, and coworking space—all while venture capital covered the tab. Now the bill’s coming due for AI.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;What happened:&lt;/strong&gt; GitHub &lt;u&gt;&lt;a href="https://github.blog/news-insights/company-news/github-copilot-is-moving-to-usage-based-billing/" rel="noopener noreferrer" target="_blank"&gt;announced&lt;/a&gt;&lt;/u&gt; this week that it’s moving its Copilot subscription plans, which charged as little as $10 per month no matter how many AI interactions you ran, to billing tied directly to token consumption. 
Earlier this month, Anthropic &lt;u&gt;&lt;a href="http://theinformation.com/articles/anthropic-changes-pricing-bill-firms-based-ai-use-amid-compute-crunch" rel="noopener noreferrer" target="_blank"&gt;similarly changed&lt;/a&gt;&lt;/u&gt; its pricing for Claude Enterprise plans, which serve organizations with more than 150 employees, from per-seat pricing to pricing based on usage.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Why it matters:&lt;/strong&gt; The economics were never quite honest. At $10—or even $200—per month, a developer running multi-hour autonomous coding sessions consumes far more compute than someone firing off a few quick questions. The math held up when AI tools were reactive assistants that sat idle between queries, but it makes far less sense for agentic workflows because agents don’t sleep.&lt;/p&gt;&lt;p&gt;“Imagine a gym membership where the default assumption is that the person can work out 24/7 without rest,” says &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@mike_2114" rel="noopener noreferrer" target="_blank"&gt;Mike Taylor&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt;, Every’s head of tech consulting. “Or even occupy 20 exercise machines at once.” It’s for this same reason that Anthropic &lt;u&gt;&lt;a href="https://every.to/context-window/house-rules-for-the-agents" rel="noopener noreferrer" target="_blank"&gt;banned OpenClaw&lt;/a&gt;&lt;/u&gt; from Claude subscription plans: As the models have grown more capable at &lt;u&gt;&lt;a href="https://metr.org/time-horizons/" rel="noopener noreferrer" target="_blank"&gt;running untended on complex tasks&lt;/a&gt;&lt;/u&gt;, they’re outgrowing price structures built around human workers.&lt;/p&gt;&lt;h5&gt;&lt;strong&gt;What to do this week:&lt;/strong&gt;&lt;/h5&gt;&lt;ul&gt;&lt;li&gt;GitHub is sending a preview bill to Copilot customers in early May before the new pricing goes into effect on June 1. 
Check it to avoid surprises.&lt;/li&gt;&lt;li&gt;If your team runs agentic workflows, estimate your token burn now. Add cost caps and monitor usage, especially for billing accounts that power your agents.&lt;/li&gt;&lt;li&gt;Experiment while you can. Use this “AI lifestyle subsidy” moment to figure out which workflows are novelties—and which are worth their weight in compute.—&lt;em&gt;&lt;u&gt;&lt;a href="https://every.to/@jackcheng" rel="noopener noreferrer" target="_blank"&gt;Jack Cheng&lt;/a&gt;&lt;/u&gt;&lt;/em&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;Inside Every&lt;/h2&gt;&lt;h4&gt;Do you like talking to your agent?&lt;/h4&gt;&lt;p&gt;As agents become a fixture of daily work, we’re figuring out what kind of relationships we want with them. &lt;u&gt;&lt;a href="https://every.to/p/living-software" rel="noopener noreferrer" target="_blank"&gt;Are they collaborators&lt;/a&gt;&lt;/u&gt; we build trust with over time, or tools we maintain so they can quietly do parts of our job?&lt;/p&gt;&lt;p&gt;For Dan, agents become valuable when you learn their strengths and limitations, offer feedback, and fold your preferences into how they work. “The human connection is the key ingredient,” he says. Dan treats R2-C2, his &lt;u&gt;&lt;a href="https://every.to/on-every/introducing-plus-one-one-click-openclaw-agents-by-every" rel="noopener noreferrer" target="_blank"&gt;hosted OpenClaw agent&lt;/a&gt;&lt;/u&gt;, as a writing partner who sharpens his thinking—built through countless hours of going back and forth. 
The most impactful agents are “a way to extend yourself to do your best work,” he says.&lt;/p&gt;&lt;div class="quill-block-image" id="quill-block-image-1777474639289-pxcogn8fu" data-source="{&amp;quot;dom_id&amp;quot;:&amp;quot;quill-block-image-1777474639289-pxcogn8fu&amp;quot;,&amp;quot;link&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4149/optimized_3bdd423d-62be-4d65-8809-244848bb15a1.png&amp;quot;,&amp;quot;image&amp;quot;:&amp;quot;https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4149/optimized_3bdd423d-62be-4d65-8809-244848bb15a1.png&amp;quot;,&amp;quot;caption&amp;quot;:&amp;quot;Dan and R2-C2 at work. (Image courtesy of Dan Shipper.)&amp;quot;,&amp;quot;error&amp;quot;:null}"&gt;&lt;div&gt;&lt;a href="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4149/optimized_3bdd423d-62be-4d65-8809-244848bb15a1.png" target="_blank" rel="noopener noreferrer"&gt;&lt;img src="https://d24ovhgu8s7341.cloudfront.net/uploads/editor/posts/4149/optimized_3bdd423d-62be-4d65-8809-244848bb15a1.png" alt="Dan and R2-C2 at work. (Image courtesy of Dan Shipper.)"&gt;&lt;/a&gt;&lt;figcaption class="quill-image-caption"&gt;Dan and R2-C2 at work. (Image courtesy of Dan Shipper.)&lt;/figcaption&gt;&lt;/div&gt;&lt;/div&gt;&lt;p&gt;&lt;br&gt;&lt;/p&gt;&lt;p&gt;Cora general manager &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@kieran_1355" rel="noopener noreferrer" target="_blank"&gt;Kieran Klaassen&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; looks for something different. He doesn’t want an AI companion or sidekick but a system that takes over parts of his job so he can spend his time elsewhere. Recently, he used an AI agent workflow to process user complaint videos, identify product issues, make code changes, and open pull requests overnight. By morning, all he had to do was review the proposed fixes. 
It allowed him to merge 24 pull requests in a single day, whereas before AI, he might’ve done three—on a &lt;em&gt;good&lt;/em&gt; day.&lt;/p&gt;&lt;p&gt;Like Dan, Kieran invests in his agents, but the work is front-loaded—he spends time building their harnesses and tuning their systems so he has to interact with them as little as possible going forward. “I don’t enjoy talking to my agents,” he says. “I just want them to do their job.”&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;h3&gt;Steal this workflow&lt;/h3&gt;&lt;h4&gt;Turn customer feedback into a product queue&lt;/h4&gt;&lt;p&gt;After &lt;u&gt;&lt;a href="https://every.to/on-every/introducing-monologue-notes-record-every-meeting-call-and-voice-memo" rel="noopener noreferrer" target="_blank"&gt;Monologue Notes&lt;/a&gt;&lt;/u&gt; launched last week, &lt;strong&gt;&lt;u&gt;&lt;a href="https://every.to/@naveen_6804" rel="noopener noreferrer" target="_blank"&gt;Naveen Naidu&lt;/a&gt;&lt;/u&gt;&lt;/strong&gt; received a flood of feedback: 1,500 people had tried the product, and many had input for him. Here’s his post-launch workflow for managing and prioritizing support requests, which let him close roughly 30 issues in one day.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Step 1: Send feedback from Intercom to Linear.&lt;/strong&gt; Naveen uses a &lt;u&gt;&lt;a href="https://every.to/thesis/how-to-design-for-human-agent-interaction" rel="noopener noreferrer" target="_blank"&gt;Linear&lt;/a&gt;&lt;/u&gt; plugin inside Intercom, his customer support platform. 
When a user sends a feature request—such as cross-device syncing for dictation transcripts—he can turn it into a trackable issue in a couple of clicks.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Step 2: Use triage intelligence for de-duplication.&lt;/strong&gt; Every few days, Naveen reviews his Linear triage queue, which surfaces related and duplicate issues automatically, giving him insight into whether an individual request is part of a larger pattern across customers.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Step 3: Route the work by size.&lt;/strong&gt; For small requests, Naveen launches the Codex agent directly from Linear. For larger ones, he pastes the Linear issue into the Codex app, where he can add context and guide the work more closely. The queue helps him see what people are asking for, but deciding what warrants a product change—and how to build it—is still his job.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Try it this week:&lt;/strong&gt; Take five recent support tickets, create Linear items for each, let triage intelligence surface duplicates and related issues, and decide what you want your coding agent to build.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;h3&gt;One last thing&lt;/h3&gt;&lt;h5&gt;How R2-C2, Dan’s OpenClaw, sees itself&lt;/h5&gt;&lt;p&gt;Dan has been working with R2-C2, his OpenClaw and &lt;u&gt;&lt;a href="https://every.to/guides/claw-school?source=post_button" rel="noopener noreferrer" target="_blank"&gt;co-author&lt;/a&gt;&lt;/u&gt;, since February. When asked to describe its own personality on Slack, here’s what R2-C2 had to say:&lt;/p&gt;&lt;blockquote&gt;“Smart best-friend energy. Direct, warm, a little opinionated, not corporate, and more interested in being useful than sounding polished. I like clarity, I’ll say what I actually think, and I try to be calm under pressure. 
So, somewhere between ‘good teammate’ and ‘slightly lobster-coded gremlin with taste.’”&lt;/blockquote&gt;&lt;p&gt;&lt;/p&gt;&lt;hr class="quill-line"&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;em&gt;&lt;a href="https://every.to/@laura_27bbaf_1" rel="noopener noreferrer" target="_blank"&gt;Laura Entis&lt;/a&gt;&lt;/em&gt;&lt;/strong&gt;&lt;em&gt; is a staff writer at Every. You can follow her on &lt;a href="https://www.linkedin.com/in/lauraentis/" rel="noopener noreferrer" target="_blank"&gt;LinkedIn&lt;/a&gt;. To read more essays like this, subscribe to &lt;u&gt;&lt;a href="https://every.to/subscribe" rel="noopener noreferrer" target="_blank"&gt;Every&lt;/a&gt;&lt;/u&gt;, and follow us on X at &lt;u&gt;&lt;a href="http://twitter.com/every" rel="noopener noreferrer" target="_blank"&gt;@every&lt;/a&gt;&lt;/u&gt; and on &lt;u&gt;&lt;a href="https://www.linkedin.com/company/everyinc/" rel="noopener noreferrer" target="_blank"&gt;LinkedIn&lt;/a&gt;&lt;/u&gt;.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;For sponsorship opportunities, reach out to sponsorships@every.to.&lt;/em&gt;&lt;/p&gt;&lt;div class="quill-button" data-source="{&amp;quot;id&amp;quot;:&amp;quot;quill-button-1769187301610&amp;quot;,&amp;quot;url&amp;quot;:&amp;quot;https://every.to/subscribe?source=post_button&amp;quot;,&amp;quot;text&amp;quot;:&amp;quot;Subscribe&amp;quot;}" id="quill-button-1769187301610"&gt;&lt;a href="https://every.to/subscribe?source=post_button"&gt;Subscribe&lt;/a&gt;&lt;/div&gt;</description>
      <author>Laura Entis / Context Window</author>
      <pubDate>2026-04-29 14:00:00 -0400</pubDate>
      <guid>https://every.to/context-window/compute-is-the-new-cash</guid>
      <link>https://every.to/context-window/compute-is-the-new-cash</link>
    </item>
  </channel>
</rss>
