Transcript: ‘Why We Switched From Claude Code to Codex’

‘AI & I’ with Austin Tedesco


The transcript of AI & I with Austin Tedesco is below. Watch on X or YouTube, or listen on Spotify or Apple Podcasts.

Timestamps

  1. How Codex went from a tool for senior engineers to a daily driver for knowledge work: 00:00:57
  2. How Claude Code proved that a great coding agent works for any knowledge work: 00:02:42
  3. Austin’s switch to Codex: 00:07:24
  4. How Austin set up Codex with folders, keys, and reviewer agents: 00:13:48
  5. Using Codex to brainstorm automations across Gmail, Slack, and Notion: 00:18:24
  6. How Austin manages the human review step when Codex is drafting communications: 00:22:42
  7. Using Codex to build specialized agents inspired by product executive Claire Vo: 00:28:54
  8. Synthesizing meeting transcripts and Slack threads into a go-to-market plan: 00:31:09
  9. Building a live KPI tracker in Notion that agents can read: 00:40:15
  10. Using Codex for recruiting: 00:44:54

Transcript

(00:00:00)

Dan Shipper

Codex is one of those things where three months ago, six months ago, it was trash. If anyone from OpenAI is listening to that, I stand by it 100%. If you have a great general-purpose coding agent on your computer, it’s actually really great for any kind of knowledge work. If it can write software on its own, it can do any kind of knowledge work on its own.

Austin

When I sign on during the day, Codex is the first thing I open. It’s pulling in whatever I need from Gmail, Slack, Notion, Stripe – all of our data sources. It’s where I spend about 80% of my time working, overwhelmingly because the app itself is just so good.

Dan Shipper

There’s a new operating system for how and where you’re going to get your work done, and it’s this kind of agent management interface.

(00:00:57)

Hello, everybody. Welcome to Codex Camp – Codex for Knowledge Work. Excited to have you here on this auspicious day, the day after GPT-5.5 released. I’m here with our head of growth, Austin. Austin, say hello.

Austin

Hello.

Dan Shipper

We’re psyched to do this. Codex is one of those things where three months ago, six months ago, it was trash. And if anyone from OpenAI is listening, I stand by that 100%.

It was really built for senior engineers doing pair programming. It would argue with you, make you feel stupid. It had no emotional intelligence.

I think OpenAI had this interesting theory starting with GPT-5 that your vibe coding was going to happen in ChatGPT – that was where all that would live – and senior engineers were going to use Codex to do their programming work, but the model was going to be sandboxed and hobbled so it didn’t do anything bad.

What basically happened is Anthropic figured out that having a model that’s fast, smart, and emotionally intelligent on your computer – one that can actually access your computer – is a really great experience for programmers. It meant you could throw away a lot of the old stuff you used to have in a programming environment built for typing code. You could just type commands into your terminal and it would start working.

Anthropic figured out something bigger: if you have a great coding agent on your computer, it’s actually great for any kind of knowledge work. If it can write software on its own, it can do any kind of knowledge work on its own.

(00:02:57)

We started to move from a world where programmers were beginning to delegate tasks inside Claude Code to a world where any kind of knowledge work is being delegated inside Claude Code, Claude Cowork, and tools like it. OpenAI had their original split – vibe coding in ChatGPT, engineering work in Codex – but they saw what was happening with Claude Code. Over the last three months or so, they’ve done a hard pivot. Codex has gone from a senior-engineer-only pair programming tool to, honestly, my daily driver for this kind of work.

I use Codex for everything from deep engineering to writing to recruiting. They figured out that having a general-purpose agent on your computer – with the ability to write code, access your file system, use a browser, wrapped in a desktop app – is the ideal next step for knowledge work. And I think they’ve built the best current version of that.

What’s snapping into focus now is that there’s a new operating system for how and where you’re going to get your work done, and it’s this kind of agent management interface. That’s true whether you’re using Claude Code, Claude Cowork, or Codex. It’s becoming a race between model companies. Each one has their own desktop app for agent management – at its core a programming agent being used for knowledge work. Anthropic has Claude Code and Claude Cowork. OpenAI has Codex. xAI recently essentially bought Cursor. Google has Antigravity, though no one is seriously using it for this yet. But I imagine they will. That is the race.

(00:05:21)

For those of us who get to use these tools, it’s really important to bounce around between them. Using Codex, for example, lets you feel what it’s like to work in an agent-first world. Once you have an agent as your primary way of accessing software and the internet, it opens up all this interesting stuff that wasn’t possible before – you can send your agent out to talk to other pieces of software and come back with results.

You’re doing work on your computer through Codex or Cowork, and your agent is your interface to a lot of the work you do, a lot of the software you use, a lot of the things you do every day. And it’s actually really fun.

I wanted to bring Austin in because he’s our head of growth and he had his real agent-pill moment – tell me if I’m wrong, Austin – probably three or four months ago. It started with Claude Code. I remember you coming in on a Monday morning saying you’d spent the whole weekend on your computer, twelve hours a day, using Claude Code. You started using it for all the knowledge work tasks a growth marketer would. Over the last couple weeks as we’ve been on GPT-5.5, and after I kept telling you to try Codex, it sounds like you’ve shifted everything over. I’d love to get into your workflows and then do some demos so people can see things from your perspective.

Austin

Yeah, that sounds great. My agent-pill moment was spending a week going deep into Claude Code in the CLI around December into January, hooking it up to everything I do for work and my personal life. I use Warp as my CLI interface. What I found was that the things it could automate, the things it could handle for me, and the way it could work as a thought partner to make my work better – it was the only way I wanted to do the kind of knowledge work that requires strategic thinking, data analysis, shipping marketing copy. All this stuff that normally gets you spread across a bunch of apps and tools during the day.

Around February, Dan kept nudging me: “You really should try Codex.” And if anyone on the team says that, I’ll go try it. I like pushing myself on more engineering-type tasks to see what these models are capable of.

So I tried to build a personal side-project app in Codex, since that was one of the things Dan said it was really good for. My immediate reaction was: I think it’s probably better at building the app, but I can’t tell – because nothing had ever made me feel more stupid than Codex did two months ago.

(00:09:00)

I always use Compound – our Compound Engineering plugin that Kieran Klassen made – for basically everything, including knowledge work, especially if I’m trying to build an app or ship a PR. I made a plan. It came up with three questions about which direction to go, and I had no idea what it was talking about. For every question, I asked it to explain in more detail. Its response was basically: why? Just do what I’m recommending. I basically stayed in Codex for engineering work because I did like the results, even if I didn’t love working in it. But about 80% of the time I was still reaching for Claude Code and the CLI.

When we got our hands on the new GPT model a month ago, the first thing I felt was: at the very least, there’s parity between the latest Opus model and the latest GPT model for the kind of knowledge work I do. There are a few things Opus does better, a few things Codex does better. Outside of design, which I still really trust Opus for, it feels like a personal preference thing. But the real differentiator for me is the Codex desktop app. There’s simply no comparison in terms of how fast and powerful it is versus the Claude desktop app. I’ve never been able to get Cowork to work for me, and I think it’s because I’ve been ruined by the Codex app. It’s so fast. The sub-agents are so good. The way it suggests and ships automations for me – I can’t imagine not using it.

I wouldn’t be surprised if at any point the Claude desktop app catches up. They could ship versions where it’s faster and better. But right now, when I sign on during the day, Codex is the first thing I open. It’s pulling in whatever I need from Gmail, Slack, Notion, Stripe, all of our data sources. This morning I was like, “Oh yeah, we need a run of show for this camp.” I messaged Codex: make the run of show. It knew exactly where to look because we’d already had conversations about what we were going to cover today. It pushed it to Notion, sent it to Slack. It was perfect. It’s where I spend about 80% of my time working, overwhelmingly because the app itself is just so good, and the model has now gotten good enough to be my daily driver.

Dan Shipper

Yeah, I feel the same way. Someone asked whether we’re discussing the app or the CLI – we’re discussing the desktop app. I think you’re making a good point that both companies see the endgame here and are pushing in the right direction.

(00:11:51)

For a while it’s going to be a horse race – every couple of weeks or months, one will pull ahead with something amazing, and then the competitor will respond. I don’t have inside information, but I imagine Anthropic will release something in a couple months that puts it at parity or better. They’ll just keep trading. At some point it will slow down and you’ll end up with separate ecosystems, but right now they’re fairly easy to switch between. It’s not trivial, but it’s pretty easy. You can kind of ask Codex, “Hey, can you go grab all my Claude stuff?” And it’ll do it.

Austin

It feels that way when you do it. It’s funny – I’m in New York right now, I usually live in LA. Most of my friends who are in the knowledge work space have been asking me what they should be using. They’re all Claude Code or Claude desktop app-pilled. When I tell them I’ve fully transitioned to Codex, this look of horror crosses their face. “Do I really have to?” I tell them they don’t, but also: you really should right now. You would get a big benefit from it. And it’s interesting – and unsurprising to me – how resistant people have been. The Claude desktop app is genuinely game-changing. So the idea that the Codex app is maybe 30–40% better makes people think: “That’s a lot of work.” Which we can get into. The migration was actually very easy.

Dan Shipper

Yeah, let’s do it. I kind of agree – it’s more an emotional thing of “I have to learn a whole new thing,” but it’s pretty similar. I’d love to see your workflows.

(00:13:48)

Austin

Cool. So this is the Codex app. I’ll do a quick tour, since I know a lot of you have seen it, but here’s where I go and how I use it.

One thing I love about the Codex app is that it’s much better organized than the Claude desktop app. The ability to have folders with persistent, consistent chats inside of it that I can go back and check is really useful.

And the big differentiator is that because it’s so much better for engineering – occasionally I’ll ship a PR for one of our products – I don’t have to switch between the Claude Code CLI, the Claude desktop app, and Codex. I can be here working on our KPI sheet and then go down and ship a PR for Plus One, all in the same place.

I also tried the Claude desktop app updates last week when they shipped, and the stress test I put on it was: make a go-to-market plan for our new product and ship a PR to Sparkle in different chats. It was so clunky and slow. When you do stuff like that inside Codex, it just works. Quickly and well. Once you start feeling that, it’s very hard to turn away.

So I have different folders – some for side-project apps I play around with, one for my personal OpenClaw where I can manipulate things. The one with all the chats is what I call the Every Growth OS. All it is is a folder with API keys and secrets so it’s connected to everything we use for Every, plus some project instruction files that explain what the Every business is, what we care about, how we like to work together.

(00:15:36)

It also has some reviewer agents inside of it, all informed by how Compound Engineering works. Inside Compound Engineering – Kieran’s plugin – there’s a review step once you do some work that checks for security and a few other things. Those reviewers are often not as helpful when I’m doing a strategic plan for a go-to-market initiative. So inside this folder, I’ve got a fork of that for strategic alignment with company goals and for data accuracy. Having that inside the folder means that as I’m making plans, I can get targeted reviews from the model.

The first thing I wanted to show is how I’d recommend someone get started in Codex. I was actually walking our editor-in-chief Kate through this yesterday.

What I did was: through the plugin tool in Codex, I went in and manually connected all the tools I use every day – Gmail, Slack, Notion. Then I opened a new chat inside this folder, which was built through Claude Code. Claude Code built the whole Every Growth OS system. There’s a CLAUDE.md file in there, saved locally and synced to GitHub. I just opened that project inside Codex when I started working here.

(00:18:24)

I start a Compound Engineering brainstorm workflow – it’s just something I reach for when I want a “let’s think about this together” session with the model. What I said was: go take a look at the things I use the most – Notion, Slack, and Gmail – and think of some automations that would help me with my work.

I find that when I’m trying something new, whether it’s a model or an app, having a very smart frontier model tell me how to use it and what it should do is exactly where I want to start, rather than trying to think of things myself.

Codex came back, looked at what’s going on for me and for the company right now, and I thought these suggestions were really good. It identified this kind of follow-up radar – a big challenge for people who do knowledge work, who do partnerships, who do social media marketing. There’s all this stuff coming at you across different sources. What if it handled the triage for you? What if it had this kind of command center when you run a camp or an event, which usually requires a ton of moving parts? And then for recruiting and hiring – we don’t use a tool like Ashby. We sync everything through Notion because agents can handle a lot of the pipeline and tracking work for us.

For the sake of this demo, I just told it: looks good. And this is actually the thing I’ve always been most impressed by in Codex – the automations it generates just work. They require very little tweaking to be something I’d actually use every day. There’s a set of instructions it comes up with based on what it knows about me. I can change when it runs, give it additional context, connect it to other things. Mostly it just works.

There’s one I now have running that compiles, at the end of each day, all the stuff I haven’t responded to yet, drafts the replies, and we can knock it out together. Sometimes all I need to do is give a thumbs-up Slack reaction, and it’ll handle that for me.

(00:21:00)

I think of agents like this as the “dumb” ones – they just do the right thing every time. Then there are the “smart” ones, like an OpenClaw or Plus One, where you work back and forth and have a more creative, strategic partner. Codex is good at building both. If someone is looking to see what this thing can do to help with knowledge work, I would start here – a brainstorming automation session – because it is fast, and you’ll quickly get a sense of what it can do.

Dan Shipper

This is so good. Your Codex usage is far surpassing mine in terms of interestingness. I’m getting a lot of ideas. Let’s pause here – normally we take questions at the end, but it would be nice to let people come up and ask a question now to see what the vibe in the room is like. If you have a question about what Austin just showed, please raise your hand.

Margaret, welcome. Please introduce yourself and ask your question.

(00:22:15)

Margaret

Hi, can you hear me? I’m Margaret, I’m in Plymouth. My question is about your review step. It says “don’t send, post, archive, or modify without explicit approval.” What does that actually look like? Is it like you call up and say “let’s do the review flow now,” or does it send push notifications to your phone?

Austin

Yeah. For this, what I prefer – and I was actually talking to a friend at dinner last night who independently came up with the same approach – is that everything gets drafted and set up in Codex, and then for my final review step I actually go to the external app. It will draft all the Slack messages, and I’ll go to Slack – which has that draft reply tab – and knock them out there. I find it freshens up my brain to be in the place where I’ll actually confirm: is this what I want to send to a human being?

Same thing for email. It creates all the drafts in Gmail, and I’ll open Gmail to look at them and send them. I know some other people who just approve right inside Codex, and that’s totally valid too.

For strategic planning, it pushes to either a proof doc – the agent-friendly markdown file that Dan made – or a Notion doc. But the only time I’m really leaving the Codex app to do something is for that final human review pass before things go to the people I work with.

Margaret

That’s brilliant. Thank you.

Dan Shipper

Great. We’ll do one more, then keep going. Alex, please introduce yourself and ask your question.

(00:24:12)

Alex

Hi. My name is Alex. I’m a musician and I do a lot of gigs. I get emails from clients all the time, so I have to sort my leads from newsletters and other informational stuff. How do you make sure you’re prompting Codex to keep the emails that require a personalized response safe – so I don’t accidentally send something that costs me a client?

Austin

Yeah. For me personally I rely a lot on Cora – the AI email assistant that’s part of the Every subscription. It’s really helpful. There’s now a CLI and API connector inside Cora that I can work through in Codex, so I can tell Cora – which manages my email filtering and rules – what I want and what I value.

But here’s how I’d recommend doing it, whether you use Cora or not: have the agent interview you to get an understanding of what the rules should be. I always get a better result that way rather than just stating what I think the rules should be. I’ll do a brain dump using Monologue, our speech-to-text app, saying “here’s the problem I’m facing – my email’s a mess, let’s figure out how to triage it.”

It would work perfectly well to start a new automation in Codex saying: here are the things I want to make sure I get, here are the rules – never send anything for me, only draft; I want to go through all emails at 3pm on weekdays. Then: go take a look at all my email, spawn sub-agents to do a search across different workflows, and come back with a plan for how you’re going to set up my email.

Then you can read the plan and catch anything that looks like it might auto-archive something that could lead to losing a booking – that’s where you tweak it. And I also set reminders in Todoist – which is also connected to Codex – to check back on the new automation after 72 hours and do an audit. Prompt the model: what have you been archiving? Let me see if I’ve missed anything.

(00:27:36)

Dan Shipper

Thanks, Alex. I’ll add to that: one of the things we found – basically because Austin started doing it and I noticed – is that Austin started setting up his OpenClaw Plus Ones with Codex and Claude Code, and realized it’s just a much better experience.

Rather than the earlier version of Plus Ones, where we had a whole dashboard and onboarding experience where you had to manually click a bunch of buttons and provide a lot of context, it’s much easier to expose Plus Ones via a CLI to Codex or Claude Code. Then you just talk to Codex, and it takes everything it knows about you from your computer and past conversations and throws it into the Plus One setup.

It’s part of what I mean when I say the world changes once you assume every user has access to an agent like this. We don’t have to have a settings dashboard. We don’t have to have an onboarding experience. We don’t have to gather as much context manually – it can just be given to us by Codex. That’s really interesting.

Austin

Yeah. One of my favorite use cases: I got really inspired by an interview Claire Vo did with Lenny, where she talked about the breakthrough she had when she stopped trying to use one supercharged master OpenClaw and instead built a suite of six specialized ones. I think that applies to any kind of agent setup – and it’s similar to the new ChatGPT provisioned agents.

(00:29:30)

My path toward building that suite of agents to help with the growth function at Every was just going to Codex and to this folder. I actually sent it the transcript of Claire’s interview with Lenny and said: I want to do this too. Given everything you know about me and my work, make a plan to suggest six agents we should provision into our Slack. Consider that some might work better as Notion custom agents – those are great for doing the same thing every day – and some might need to be smarter automations.

The plan it made was really good. I tweaked it a bit after seeing it, and now I have a suite of six agents in our Slack that work really well. They still break occasionally – I think when you’re making personal agents right now, you should expect them to break a bit. But the powerful thing is that rather than getting frustrated and going back and forth with the agent, I just go to Codex, either screenshot the issue or @Slack in Codex, and say “go find this conversation where this broke and fix it.” It does a really good job of changing the architecture of the agent and shipping a fix from there.

Dan Shipper

I love that. And now I want to paste that Claire interview into Codex too. Let’s keep going.

Austin

I want to show one more thing before more questions. This is actually kind of my favorite way to use this for knowledge work – something I wish I’d had for so much of my career, because it tackles one of the most time-consuming and frustrating things about knowledge work.

(00:31:09)

We’re doing a real go-to-market public launch for Plus One soon. There have been a bunch of internal meetings and Slack conversations about how we’re taking this to market, what the strategy is. We’ve done all the work that only humans can do – the marketing case, the business case, the narratives. Not all of it is as refined as it needs to be yet, but it’s all sitting somewhere.

I had plans this week to make the go-to-market plan – something I’m responsible for – and inevitably all this other stuff came up. Job interviews, the new ChatGPT model release date. So I had a day – I think it was Tuesday – in between meetings, and I’m prompting Codex: “Hey, I’ve kind of done most of the work.” In Notion, every meeting is recorded in a single place and all the transcripts are there. We’ve talked about this a bunch in Slack. I have a template for a go-to-market plan that I really like. And I went to Codex and said: can you just make the plan?

In my head I was thinking maybe it’ll get a 6 or 7 out of 10, and we can keep nudging from there. I asked it to start with the Compound Engineering brainstorm step, ship a proof doc, and let me see how close it gets. One thing it doesn’t do super well unless I tell it to is go read our calendar of upcoming posts and launches. So as it was going, I messaged: “You always forget this – actually look at everything that’s scheduled,” because I have to account for that in the go-to-market plan.

(00:33:15)

It made a plan as a proof doc. I had five minutes between meetings. I looked at it and thought: you basically have the architecture. Then I asked it to factor in one more change and ship the plan to Notion.

The plan it shipped to Notion – I was reading it and thinking this is basically 80–90% of the way there. Not because I’m relying on the model to come up with our go-to-market strategy. I’m relying on it to look at everything we’ve already said and thought about the strategy, piece it together, and then review it. There’s a lot of important context loading that happens here – it knows what our target ICP is, it knows our goals, it knows how we think about narrative positioning.

Before this was possible, my only options were blocking off a whole day to do this or staying up all night after work to write it. This has been such a game changer.

And the other part I’ve found really helpful is that I don’t make this plan for humans. I make it for humans and agents, and primarily for humans to understand through agents. When I send it to the team working on the go-to-market, they can read it – it’s digestible to humans. But it’s also the full plan sectioned off in one place, so Brandon, our COO, who’s deep in this product, can ask his Plus One or ask Codex or Claude Code: “Summarize Austin’s plan for me. Give me the business case.” Brandon needs to build the pricing model, so he can work with an agent against the plan directly.

I spent so much of my career thinking about how the proposal looks when I present it to the CEO – the fine-tuning, is this two-pager going to land? Giving that up and just asking: is the plan actually good, and is it going to make sense to Dan’s agent when he reviews it? That makes me work faster, makes the work better, and means I don’t have to think about the dumb stuff that doesn’t matter.

Dan Shipper

I totally agree with that. The first thing you said: normalize sending agent documents around. That’s why we have Proof – it’s just an easy way to send the markdown documents we generate to each other and review them together.

(00:36:03)

There’s this whole strand of AI stuff about making AI write in your voice – we even do that with Spiral. But there’s another strand: just normalize AI writing. Because honestly, I would often prefer to read your agent’s writing than your writing – not because it’s better writing, but because I know it’s easier for you to get all that thinking together in a format I can read if you let your agent write it. What I care about is: do you stand behind it? Have you thought about it? If I ask you about a particular bullet point, will it be clear you’ve thought it through? As long as we have that trust, I absolutely prefer the agent version.

Austin

Totally. My friend Rachel Cardin, who runs the Substack Linked Bio about social media, had a really good piece this week about frustrations for people working in social – this pressure they feel to run everything through AI, and quality going down. One reason is that dichotomy of what you actually stand behind. Are you running something through AI and maybe your manager doesn’t even know what it says?

The thing I love about working at Every is that you show up to a meeting, you’ve shared an AI-written document ahead of time, and the expectation is that you’re going to stand behind all of it. If someone asks about what’s in that document and you say “I didn’t even know that was in there,” you’re exposed.

(00:38:09)

The nice thing is we keep investing in skills and workflows to make sure that never happens. I have rules inside this project file – don’t add anything I haven’t said in another context. Send your suggestions in the chat, but don’t put them in the document. And depending on how big the context gets, models can follow or not follow those rules – which is another reason I always leave Codex for that final review before things go to the humans I work with.

Dan Shipper

Yeah. And the last thing I want to point out from what you said: a lot of the time you spend working is about taking thinking you’ve already done and putting it into a form that other people can read and consume. The important part is doing the thinking.

There is something to writing yourself – I love writing, it’s a good way to think, and sometimes you actually want to do it. But there’s a lot of stuff, like company strategy, where the thinking happens out loud in meetings.

(00:39:18)

I’m writing something right now – a retrospective on the last three and a half years of AI and where I think we’re going – and it’s so hard to sit down and write, but it’s much easier to just dictate. I took a Monologue note where I was just saying stuff, and I’m using the AI to help me figure out what I’m really trying to say. In those cases, it’s so nice to just record stuff, give Codex access to everything, have it spit out a strategy doc, and then go through it and make sure it’s stuff you agree with. It’s such a time saver – especially if you’re someone like Austin or me who’s in meetings a lot and doesn’t have huge chunks of time to sit down and do a big strategy document. It helps you get that thinking done in the cracks of your day.

Austin

Yeah, me too. I want to show one more thing before more questions.

(00:40:15)

Austin

This is kind of a mix of knowledge work and engineering that would never have been possible without these tools, and it’s something I really love Codex for: rebuilding our KPI tracker.

We have so many different parts of the business at Every, and it’s very difficult to get all those data points into one source of truth in a traditional tool – even PostHog, which I really like and a lot of our data runs through. To get one dashboard that is both human- and agent-facing, up to date with all the metrics we care about, I haven’t found a great solution in an off-the-shelf tool.

So I’ve been rebuilding our KPI sheets inside Notion, with the goal that anyone can point their agent at it and see: how are new paid subscription trials doing? How are page views doing? How is Monologue iOS MRR doing? All versus plan. Because it doesn’t just help you work as a human – it really helps you automate agentic work. If your agent sees that you’re tracking behind on SEO for a keyword you should be winning on, it can just go ship a bunch of landing pages to try to capture more of it, if the sources of truth are solid.

(00:42:09)

I’ve been working through this complex workflow problem in Codex: let’s build this sheet together, and have it live in a Notion database that all of our agents can point at. The first version was: can Codex one-shot this? It has all the API keys, all the context on how we measure MRR and everything. Each time it was a little off – maybe 5–10% off on the formatting, the numbers, the framing. And our MRR number can’t be 5% off. We can’t run a business where the source of truth is even 3% off.

So the thing I force myself to do – and it feels weirdly tedious given everything these models can do, but it makes sense – is going column by column to ensure each one is exactly right and defensible. It’s the only way we can run and grow reliably, and especially the only way we can confidently let agents take actions based on what’s in that KPI sheet.

I’m using Notion’s Workers tool, which is a dev tool for building always-on API calls – pulling in our Stripe data, our social data, creating little scripts. All stuff I don’t really understand technically, but I understand the outputs: a Notion database that updates every six hours with all of our metrics. It’s just nice that I can do that now without hiring a consultant or pulling time from our engineers who work on our data. I can do it by prompting the model and understanding how the metrics are supposed to work.

Dan Shipper

Amazing. Is it going to be ready on Monday?

Austin

It’ll be ready on Monday. Yeah.

Dan Shipper

Yes. Because figuring out how much money you’re making and how much you’ve grown turns out to be a genuinely philosophical question. You have to go in and set that frame yourself. We’ve been dealing with an outdated sheet that’s pulling numbers, but are the numbers actually correct? Even outside of AI, there’s no one way to measure MRR – you just want to do it the same way every time, so you have to decide.
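That decision can be made concrete. Here is one possible MRR definition, purely as an illustration of the point: count only active, paying subscriptions and normalize annual plans to a monthly figure. The field names and choices are hypothetical, not how Every actually measures it; what matters is encoding one definition and applying it the same way every time.

```python
def mrr(subscriptions):
    """One possible MRR definition (an illustration, not a standard):
    sum active, paid subscriptions, normalizing annual prices to a
    monthly figure. Trials and canceled subs contribute nothing."""
    total = 0.0
    for sub in subscriptions:
        # Decision: trials and churned subscriptions don't count.
        if sub["status"] != "active" or sub.get("trial", False):
            continue
        price = sub["price"]
        # Decision: an annual plan counts at 1/12 of its price per month.
        if sub["interval"] == "year":
            price /= 12
        total += price
    return round(total, 2)

subs = [
    {"status": "active", "price": 20, "interval": "month"},
    {"status": "active", "price": 240, "interval": "year"},
    {"status": "active", "price": 20, "interval": "month", "trial": True},
    {"status": "canceled", "price": 20, "interval": "month"},
]
print(mrr(subs))  # → 40.0
```

Another team could reasonably count trials, or treat annual plans differently; the point is that the choice lives in one place, so every sheet and every agent reads the same number.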

(00:44:54)

Before we get to more questions, one other thing I use Codex for that genuinely blew my mind from a knowledge work perspective: recruiting.

We’re hiring a lot, and we were looking for a head of Learning and Development – someone to help us run courses. I thought about companies that have run really great in-person technology courses to teach people programming, design, that kind of thing. From the 2010s New York scene, General Assembly is the company that comes to mind.

My theory was that a good person for this role would probably have worked at GA. So I just said to Codex: can you get me a list of GA alums? I’m hiring an L&D director, and I want you to filter and sort the list by people who have subsequently gotten into AI.

And it did it. It gave me a list. The first person I clicked on, I thought: this guy is perfect. And then I noticed he already followed me on Twitter, so I just DM’d him. I don’t know if we’ll end up working together, but it was one of those holy-shit lightbulb moments. Normally you’re sorting through a ton of applications trying to find the right person. You’ll still do that, but for any kind of outbound effort, this can find that needle in the haystack really, really well. Highly recommend.

Okay, we’ve got about ten minutes left. Let’s take more questions. And one thing we haven’t mentioned yet: if you’re here today, you’re getting Codex credits. Austin, do you want to run through that quickly?

Austin

Yes. OpenAI has given us a code for 250 attendees of this camp to get a free month of ChatGPT Pro Lite – about a $100 value. You can redeem it at the link I’m dropping in the chat right now.

Dan Shipper

So this is our gift to you as Every subscribers. We try to do things like this all the time – we’ve given out Cursor credits, Notion credits, a lot of other stuff. We have more coming. We just want you to be able to try these tools and be at the edge with us.

One note: this is for new users only – it’s for people who don’t already have a plan. We’ll try to get something for existing users and send it out as soon as we can. Alright, let’s do some more questions.

(00:49:15)

Rich, please ask your question.

Rich

I saw at the beginning you were using Compound Engineering as part of your workflow. Are you using the off-the-shelf plugin, or have you made tweaks to it? And where does it work or not work outside of a pure code-creation workflow?

Austin

I find there’s no overwhelming need to fork your own version of Compound Engineering. I used it as-is for a long time for all of my knowledge work, and it was extremely powerful. Then about two months ago, the main thing I noticed came at the review stage: the reviewers that Kieran and Trevin had built are very specific to engineering, thinking about security and front-end design even when I’m doing a go-to-market plan.

The agent will say: “I’m supposed to go through this review step, but it looks like it’s designed for engineering.” And it’ll change course and review for something else instead. So what I did was fork a version of it – it’s publicly available on our GitHub, called Compound Knowledge – which takes the Compound Engineering plugin (also public and forkable) and tweaks it for general knowledge work. I started in Claude Code, then updated it in Codex, and said: I want to tweak this for knowledge work. That’s what I was referencing earlier about the reviewers being more specific – strategic alignment and data accuracy rather than security.

(00:51:21)

More than anything, this is a really fun way to learn and push yourself with these models. You’re welcome to just use the Compound Knowledge version – we’ll include it in the follow-up email. But I got a ton out of building it. I’d never made a plugin like this before. And to make your own version – say you do social media marketing and you want all reviews to go through your style guide and your past performance data – I got a lot out of working this way.

That said, if you just want Compound Engineering to make your knowledge work better without modifications, it absolutely works really well right out of the box.

Rich

Got it. Interesting – the end-of-step review is apparently still valuable for you even outside of pure engineering work.

Austin

Yeah, the Compound step is really valuable. Inside our Notion we have a go-to database: after any session, you can send the learnings to a team-wide shared Compound source of truth. Whenever I’m done with any session in Codex or Claude Code, the agents are instructed to ask me: “Should we compound this – save it somewhere for the learning? And should we turn any workflow from this session into a skill so we can just do it automatically each time?”

Rich

Got it. I’ll check that out. Thanks.

(00:53:00)

Dan Shipper

All right. Rory, please introduce yourself and ask your question.

Rory

Hi, my name’s Rory. Is there anything about the way you work at Every – like maybe ending meetings a few minutes early – that you’d recommend to teams adopting workflows like yours?

Austin

Yeah, I think what I’m hearing is a very real challenge: it’s so exciting and alluring to spend a lot of your day playing with this stuff. You also find yourself thinking: “If I just get this automation right, my work is going to be 100 times better.” And I actually do find myself on a lot of days spending most of my non-meeting time building really good tools and automations, and not making enough time to do the actual tasks that have to push the business forward – shipping social posts for the day, or whatever.

I don’t have an awesome answer for it, outside of the fact that playing around is kind of core to how we operate at Every. It’s something Dan pushes all of us to do, and it’s one reason I love working here. It’s also, to me, the best way to learn and what makes me better at everything I do. The guidance I’ve given myself is that the automations in Codex help keep me on track to get the work done even when I’m deep in building mode.

Dan Shipper

Yeah. And I also read your question, Rory, as: how do we make more time for the AI playing and experimenting if we’re already really busy? What are the organizational practices for that?

(00:55:45)

My take is it’s just a cultural thing. We love playing around, and that’s part of our job. There’s also something happening right now where the tools and workflows are changing so fast that if you just focus on how your job currently works and run as fast as possible, someone using a new tool with a new paradigm will beat you by default. If you give yourself some time to play around, it may feel like a waste of time – but you’re leveling yourself up to a different game entirely.

One organizational practice we have for this is called Think Week, which we do twice a year. We literally don’t do any of our day-to-day work. We spend a week together just playing with new stuff, building, learning, and being together. You don’t have to do a whole week – but doing something like that once a quarter for a day can be really powerful. Give people the time and space to explore, if you can.

Sweet. All right, y’all. That is our program for today. Thank you for coming. We love seeing you. We love doing this with you. Remember, Every is the only subscription you need to stay at the edge of AI. We would love it if today you would tell one friend to go subscribe to Every. We want to get more people in here. We’re right at this amazing point in history where we get to ride this big wave together and figure it out together. Please tell your friends. See ya.

Austin

Thanks, y’all.


Dan Shipper is the cofounder and CEO of Every, where he writes the Chain of Thought column and hosts the podcast AI & I. You can follow him on X at @danshipper and on LinkedIn.

To read more essays like this, subscribe to Every, and follow us on X at @every and on LinkedIn.

For sponsorship opportunities, reach out to [email protected].
