# AI & I: How Anthropic Uses Claude Fable 5 — Mike Krieger

**Source:** Every — AI & I podcast, June 10, 2026
**Video:** https://www.youtube.com/watch?v=XWpTgCvgYaE
**Guests:** Mike Krieger (co-founder of Instagram, head of Anthropic Labs) with Dan Shipper

Auto-generated transcript with approximate timestamps. Formatted for agents: hand this file to your AI assistant and ask it to summarize, extract workflows, or adapt the prompts discussed.

---

**[00:00:04]** Mike, welcome to the show. >> Great to be here, Dan. Good to see you. >> So, for people who don't know you, you're the head of Anthropic Labs and you're the co-founder of Instagram. And today, what I want to talk to you about is Fable 5. So, Fable 5 is dropping tomorrow. We're recording this the day before. This will come out after it drops. But what I really wanted to do is bring you on the show to tell me about what it's like to use this model beyond the first day. I think when a model this powerful drops, it's so useful to have someone who's using it day in and day out to tell you: this is where it's powerful, this is what it actually changes, this is what it doesn't change, so that you don't get the same AI psychosis type thing. You can actually think about, okay, this is how it fits into my life. >> Yeah, absolutely. And it's also just been interesting. We've had some models in this Mythos class leading up to the Fable release for a couple of months now. And I think it's very exciting to see how people will build with us externally. But I think you're also right about day-one impressions; I think it really comes from getting to use this over a couple of weeks. I think we've seen that even with previous models, like the December into January

**[00:01:08]** usage of Opus 4.5 or Opus 4.6 was really important because people spent extended time on the model and then figured out, oh, actually I wasn't pushing hard enough, I've got to go further and I've got to rethink what's even possible with this generation. >> Totally. I mean, I feel like there are people internally at Every who have been using it who have been like, oh my god, I think I kind of need a new set of skills to use this model. And I think you can especially see this with people who are maybe more non-technical internally and who are more on the knowledge work side of things, where they're like, I don't even know what I would use this for. And the people who are orchestrating agents are like, holy [ __ ], I feel like there are so many new things I need to learn. So I'm curious, for you, tell us about the difference between your impression when you first tried it and now. >> Yeah, I think your point on adapting workflows is a really good one. Quite literally workflows, I'll talk about that in a second, but also just in terms of how I think about usage of the model. The timing was interesting because it kind of coincided with me transitioning from CPO into Labs and going really back into builder mode. I think it was about a month and a half or two months into that that we

**[00:02:11]** first had one of these models available internally, and I sat there and I was like, I feel like a total newbie again, because I feel like the way that I am prompting or even thinking about decomposing a task is really out of date now with this model. And even the time horizon or the sort of interactivity model, I think, has to evolve as well. Like going from, I think early on it would be like, I have an idea for this feature, can we start by do like absolutely not, right? To, great, let me express more of the intent, and then just being, I remember like, you know, Raj April be like, wow, on the one shot it's already incredibly impressive, but then it also understands the intent around how we're going to evolve this and understands the global context as well. So I think that's been a really interesting evolution till now. It was funny, I was talking to somebody this morning about how I think about doing work. I had a flight and I was like, okay, I can do most of

**[00:03:13]** this work remotely and I don't even worry that the Wi-Fi is going to drop out, because I know that if I set up the right, you know, context, instructions, like flash loop, it'll see it through. And I think my last two months have been full of a lot of times where I will wish Claude a good night, set it up on a pretty complex task, something of this model class, and wake up to, you know, actually it's usually done by like 2:00 in the morning and I guess it just twiddles its thumbs for the next four hours. But a really impressive ability to complete the swing, get itself out of the situation where it's like, okay, Mike asked me to do this complex task overnight. I got stuck because this remote service went down. I'm going to write a scaffolded backend for it for now. I'll document that. I'll go all the way through. I have a good mental model of how far that's going to get me, and when it comes back online, I'll fix it. I'll keep track of that fact. I think the most impressive thing for me is you're just able to

**[00:04:17]** delegate that kind of level of task and just trust that the right thing will happen by the end. And of course, you'll review the result, and there's still a whole verification thing that we can and should talk about, because I think it's an important part of still completing the swing there. But it's really forced me to rethink what being productive with one of these models looks like. We've talked for a while about what it's like when these models are more of a companion or a coworker, and it really feels like now it's a teammate that I can delegate a lot of work to. >> And what is your day-to-day flow like right now? Because one of the things I noticed is if you just give it a big task and you monologue into it and you let it go for a few hours or overnight, it's the most impressive model that I've ever tried. But it's so slow and it's so expensive that I feel like I don't want to use it for day-to-day tasks. So, what is your actual flow like in terms of how you use it day-to-day, and where does it slot in versus other models? >> Yeah, I've ended up having a lot more architectural planning conversations up front with it as well. So that's been another interesting change, where

**[00:05:21]** I think this is an area where I think all models need to continue to improve. And I'm really grateful for the Instagram experience of having to start from our initial version that was, like, duct-taped on a server in LA to being able to scale it and eventually integrate it with all of the Facebook infrastructure, because you kind of develop a sense of what infra abstractions and complexity are appropriate for each stage of it. And I still go back and forth with Fable, where it'll be like, this is a good implementation, and I'm like, well, I do plan on shipping this fairly soon, I think we should probably think about more than one server. That kind of back and forth is important. But a lot of that sort of planning, and the thing I've realized is Fable can be so complete in its thinking, in terms of how much you are planning with it, that often just saying, can you make an HTML page that represents what we just talked about so I can share it with the team, is actually valuable. Or even just a markdown document, but I like having diagrams. So that's been an

**[00:06:24]** interesting use of, let's plan with it, let's think it through, and then let's have some sort of document that we can align the team on. Because, and this is a dynamic I've seen in Labs and in teams beyond Anthropic, you can build a lot very quickly, and forcing more of that early alignment, even if you do an initial prototype and then back it out into more of a plan or architecture, that works too, I think is really, really key. And it actually ends up being the place where the human-to-human interaction still stays very much part of the process. And then from then on, either overnight or during the day, having it execute on those chunks of tasks is really important, and it just means having a lot more concurrent sessions than I did before, because I often will think, all right, there are these three pieces of work. I go back and forth between liking having one very long-running Claude Code session and really asking it to do everything in background, sort of forked subagents, so the main thread stays responsive, and then other times just embracing, like, it's one of those days where I'm going to

**[00:07:25]** have like five or six tabs tackle long, comprehensive work. But I do think that there's something to this long horizon, like, don't worry, I'm on it, it can take me a while, and more of this back and forth, and that modality I think is something that we'll have to figure out in our products as well. I think you want to preserve both, and they interact with each other in interesting ways. My preference is usually that I always like having at least one Claude that is high context but also very, very fast in response, and its instinct is, great, I'm going to answer you and I'll kick something off if I need to, and if not I'm just going to hang tight and wait for the next kind of loop. I do think you're right that for the, I'm just trying to fix this interaction question, or something that's very fine-detailed, Fable will go off and think very hard about those things. And I think Fable is the first model where I've actually played more with the effort levels for that reason, where I've been like, okay, I just needed to tweak some UI, you know, put it to medium or something and see how that plays out. Didn't find myself doing

**[00:08:27]** that as much with Opus, because the range felt less wide, where it really can feel quite wide with Fable. >> What about a quick question? Like, you're on the go, are you asking Fable random questions as they come to you? Because it feels like you're using a rocket launcher to kill a mosquito or something. Or are you flipping back and forth? >> It's so funny you asked that, because I had been, and you know, you're like, it's thinking, it's thinking really hard about it. Then last week I was asking it something that, truly, I felt embarrassed actually asking Fable about. It was probably something NBA Finals related, and I was like, okay, I switched my iOS app to Sonnet. I was like, oh yeah, I should use this all the time for fast questions. It's like an order of magnitude, like, feeling of, and it's actually not even the tokens per second, it's actually probably more around how much thinking goes into the answer, and some, like, the answer does not need to be fully thought through. So yeah, I'm thinking myself through it, and I think this is a good product question for us too, which is, in general you don't want

**[00:09:32]** people to have to be thinking so much about these choices. So ideally what we can coalesce around in the longer run is maybe some more bucketable use cases that are really grokkable to people, or maybe it varies by surface, where it's actually probably unlikely that most of the time with the iOS app I'm doing Fable-type tasks, and having a sticky model selection per surface might be the way to do that. We'll have to explore what that means from a product perspective, but I for sure have had the feeling of, this is not a Fable-worthy question, I should ask Sonnet. >> Can you show us something that you've built with it? >> Yeah. So one of the things that we did this go-around is we encouraged personal account usage for us, especially on the weekends, which was really fun because, you can imagine, we have a lot of Anthropic-specific tooling, etc. But it was really good to step back and be like, I'm just, you know, pure Claude Code. Let's work on something over the weekend. >> And you're in the terminal app or you're in the desktop app? >> That's a great question. I'm mostly still in the terminal app. It's

**[00:10:34]** interesting watching my wife, who's not a professional engineer and more of a UX designer/PM, really fall in love with Claude Code via the desktop app, and I think it sort of simplified some of the abstractions for her in that way. But for this one I was still in, is it "Ghosty" or "Ghost-TTY"? Ghostty, and the terminal app. But let me show you. This is one of those, everybody has some bespoke need around this. I wanted a good media tracker experience. I'm playing games, I'm watching TV shows, I get all these recommendations, and I just wanted to build something that was personal to me and fit some of the use cases that I had. The two biggest criteria that I started with were, one, really easy to add things, so you can talk to Claude, Claude does the agentic search over everything and then puts the right things in, and then also proactively, like, there's a new season or a new sequel to a game, that it could go off and research those things. Most of the UI was Fable one-shot, which was already impressive, but

**[00:11:37]** then the thread I've been pulling a lot in Labs this year is, how do you bring the software team, which is Claude these days, closer to the software itself. And so this was maybe Saturday morning. I had a full weekend with kids stuff. So a lot of this was kick off work, go for a hike with the kids, come back, continue to do the work, sometimes check in on the work on the hike. I probably shouldn't, but it was nice to pop into remote mode and see what was going on there. Try not to do that too much. But I had this idea around, hey, could we do a spike, I say spike a lot with these models, can we do a spike on, what if you could actually modify the software from within itself? I built both, there was a React Native version and then this version, which is just the web version. So I already had a chat-type thing where you could ask Claude to add things by URL, which is, you know, I want every software to have this, where I should never have to navigate a menu to do anything ever again. And this is in many ways, Dan, like the, I

**[00:12:40]** was trying to distill the agent-native architectures to their fullest degree, which is to also have the agent be able to modify the app. Maybe phase one of agent-native architecture is that every single thing in this product is accessible from the agent and has tool calls, etc. That's hopefully becoming table stakes; it's sadly not in a lot of software. And it's great, because I was like, what's that, somebody had recommended, there's a Brazilian, there's like a show about radioactive stuff in Goiânia, I did not remember what it was called, and Claude was able to figure it out. It's so much better than trying to figure that out intuitively. But then the next step I was interested in is, what would it mean to actually be able to modify the software from itself on the go? And so if you long-press this little chat thing, what I built, >> what Claude built, was a way where it uses our managed agents to basically take on edit requests, and then you can preview them, and I used the Vercel live preview thing here. This whole feature was also one-shot, which was really cool, and I just added to it over time. But, you know, it actually does like

**[00:13:42]** a little diff view if you wanted to. You can go into the managed agent conversation and see what it did. Although I almost never do, because again, I especially don't particularly care about the code quality or the long-term maintainability of this software. You can see that it had a session in here too. But it's been really fun. So I'll be using it on the go and say, like, I had a feature request the other day, oh, the floating action button was too low on native iOS but it was okay on there, can you go fix it, and it did it. It was really fun with some of the Expo tooling now, and it actually live-reloaded on my phone, which was also a really cool kind of feeling. But it was just like, does this thing need to be a production-level thing that's going to go to a million users? No, but it felt really good to have something where I felt like it didn't have to stop at just the weekend and I could keep working on it just by using it, and having this kind of end-to-end closed thing. So I felt like this was a good manifestation of both Fable's building ability, but also, I think, a lot of what both you and I have been thinking about, like how does Claude embed

**[00:14:45]** itself into software beyond just even the usage side of things. >> This is really cool, and I want people to understand: you could have built something like this, maybe not the self-modifying part, for like 10 or 20 years, but the cost to build has gotten dramatically lower. So think about how much it would have cost to do this in the Instagram days versus now. Can you help us understand how that has changed? >> Yeah, I think about this a lot when I think back to that time as well, because I thought of myself as a very productive programmer in the early Instagram days. I was really into mobile development and we had a good clarity of things, and I think the gap from idea to fully realized version of some complete product, you were still looking at 4ish days of kind of my all-nighters, which was like my natural state, up till 4, sleep until noon, which is not conducive to family life, so I've had to shift, but that was my building

**[00:15:47]** thing. But yeah, call it, you know, Instagram v1, which probably had more features than this thing did, but not by an order of magnitude, was like five days of all-nighters, me working on the front end and back end and Kevin working on the initial filters to get that out. And this was also built on already many years that I'd been working on iOS pieces as well. And then the iteration. I think a lot about what we were gated on after that launch, when things went well: we had all these ideas for where to take it, but we were just trying to keep the site up, or we were just trying to add the one incremental feature, and, you know, hashtags take a week to build, but then there's all the things that you want to continue doing on it as well. And so I think it's both that shortening of time, there's still the time required for the idea and the concept and the iteration, and then the other piece, which is the way you can then iterate on what you have, in I think a really fun but also very, you know, sort of in-the-flow kind of way. And then, now, this is me as a sort of

**[00:16:53]** professional software engineer, sort of startup founder. Beyond that, if you had that idea, and I saw multiple people go through this, it was like, well, I'll try to find maybe a consultancy that will take this on, but it's a really lossy process of what I wanted, you know, don't raise money for it. And I think the thing that is the most exciting part about these models getting not just more autonomous but, again, closing that gap between intent and execution is what I've seen it do to people's ability to build who are not, like, builders. And the trajectory of these models has been that something like Fable, of this general Mythos class, is in that class of models, and eventually models that are cheaper and more accessible to other folks become available too. And as that process happens, I just think it is opening up so many, like, I got a ping the other day, I get very excited about the

**[00:17:55]** stuff, if you can't tell, from somebody internally, and we had built them an internal tool that kind of combined Fable and access to some internal MCPs, and she said, it is the first time in my life, and she works in recruiting, the first time in my life where I feel like the thing that's in my head and the thing that exists in the world are now right next to each other. Like, I can just do it. And it was a very meaningful moment to her, because prior to that, I remember these days, these days were five years ago or four years ago, where that person, if they wanted a tool, would have to either make do or try to get an internal tools engineer who probably was overloaded with 50 other requirements. But instead now they are just having the time of their lives building. And I think that's cause for a lot of hope, because I think that human capacity for creativity and what's possible is enormous. And I think at our best we are basically expanding the number of people who can then see that through to something that feels real. >> I totally agree. But I do think that there's a question in the back of my mind, and I think it's probably going to

**[00:18:57]** be in the back of the minds of some of the people listening. So I want to ask you, given everything you just said, is software engineering over? >> Yeah, I think software engineering is different. It is dramatically changed from how I probably would have defined it if you had asked me around the Instagram time, what is software engineering? I'd probably say, all right, thinking through the hard problems and thinking about an architecture and then spending a lot of time in, you know, TextMate, I don't know where that came from, but, you know, a text editor, you're going to edit those things, or Xcode, you know, and >> watching rails, you know >> exactly right, exactly, and understanding the intricacies of Django's, like, layer and then fixing bugs after you deploy it. So much of that is radically different and collapsing into other parts of, like, product management, and I think that sort of PM/eng split, I think you guys, I see it even in our teams, has become much more diffuse. That's radically changed. But I think the overall, like, maybe zoom out from software engineering and

**[00:20:00]** think about software production, or software development but not in just a pure developer case. I think that is alive and well and essential still. So I think that is the moment that I feel like we are in. And I think Fable is another step in the direction of, and I'm not going to call it the final step, of course, a lot will still happen, but I think a pretty significant step in terms of the trust at least I end up placing in the model, in terms of its capacity to see things through and even architect things reasonably, is quite high. So that part feels like it is not ever going to be done, but it is pretty, pretty done, right? Like, it's gone really far. But I think that the overall craft of what needs you have, what are you putting out, is it actually good, is I think still a very human endeavor. But I also can see that that is not a transition that is pain-free, in a way. I think there are plenty of people who love the craft of actually putting, and I used to love

**[00:21:04]** stuff like, I solved that problem so elegantly. You dream about code, if you ever had that experience of, you dream about the thing that you're working on, wake up in the morning like, I figured out how to solve this thing really elegantly. And that for sure has passed, and I think there is a feeling of loss in some of the better engineers that I talk to, as well as the feeling of, oh my god, but I can do insane amounts of work now, at the same time. So we're holding both ideas in our heads at once, I guess. >> Which I think is the most important part of this. It's normal to feel sadness for that kind of thing and excitement. But I'm curious, let's just take the thesis of software engineering is alive and well. What does that actually look like inside of Anthropic? >> Yeah, I think there's a few pieces. I think there's still the crafting of, well, I got to take it off from like the full software development cycle or maybe what I see on a day-to-day. Maybe I'll do a little bit of both. But I think there's still a lot of, you know, we all got together, we talked about the next way we want to evolve Cowork, and now

**[00:22:08]** we've kind of broken it down into areas of ownership. I think that ends up still being quite important, because there is still context that you hold as a person that is sort of beyond Claude, right? Like, what is the actual intent of this product? How's it going? What do we need to know about the other products that are coming down the pipeline that are going to be integrated in some interesting way? So I think that aspect is really important still. And so though we have many Claudes to each human, each human, at least the way we've been working at Anthropic, still has, we call them DRIs, directly responsible individuals, still has a DRI-ship over some part of the product or some area. I think that'll be the case for a while, because I think there is value in not just this distributed, like, we should all make Cowork better, but instead, all right, I'm thinking through how Cowork does at this particular task. And we try to keep meetings minimal, but they still emerge, and you still have these kind of alignment conversations. Then a lot of that sort of asynchronous delegation, I think what many engineers here have now found is they've all built, and I think we should solve this at some point at

**[00:23:10]** a broader product level, but they've all built some version of, all right, I'm going to now create a dashboard of what all my Claudes are doing and what's waiting for me and which pull requests need my attention because either a human or a Claude code reviewer got back to me. So there is a lot of that sort of meta-maintenance of the work that, again, I think we'll standardize some, but I think some of it will always be a little bit bespoke to the way each individual likes to work, just in the way that people organize their windows, how they organize their work. And then there is, I think, also the understanding of how things work in production. There's a few next frontiers, I think, for the models, and one of them, that Fable does make significant strides in but there's more work needed here, is understanding what happens to code after it gets deployed. Because there's incidents, there's, you know, this was all working well but this network link got cut, which is not in your usual failure mode, and it manifested like, so much of Instagram, like 2012 to 2016, was dealing with that and scaling things up. And so that

**[00:24:14]** role of the engineer still remains really key, and I think getting the reps in around incident response and understanding how to stay calm, gather data, remediate what's immediate, but then go off and work on longer-term fixes, is still a necessary part of it. And then I'm trying to think if there's any other pieces that are notable as well. I think maybe the last thing to say is I really like the role that the engineering prototype now plays. You have to be clear when it's a prototype versus not. But the old phrase was, code wins arguments, and I never loved that, because the person that could code could go do it, but why should they necessarily win an argument by default? But actually it's been really cool now, where we will have some disagreement or some sort of debate about where to take a product, and often it's the PM that will say, all right, I just tried it, and it's janky in these eight ways, but look, it actually shows how this could

**[00:25:16]** work, and that can open up some interesting pieces of conversation. So almost all of that is quite different than it was 6 months ago, I think especially at the level of parallelism and the level of need for these kind of higher-order abstractions of work. But I think what hasn't changed is that ownership. >> Lots of us are shipping AI to production, which is great for productivity, but it also comes with anxiety. You tweak a prompt, swap models, adjust parameters, and everything looks fine in testing. So you merge, and then 3 days later or even sooner the support tickets start rolling in. The AI is giving your customers unexpected answers and you have no idea when it happened or why. Braintrust is the AI observability platform that fixes this. It connects evals and observability in one workflow. That way you see what actually happened in production and can measure whether changes made things better or worse. Traces show the full execution path. Evals define what good looks like, and experiments let you compare prompts and models side by side before shipping. Production traces feed directly into your eval datasets. Every failure becomes a test case. You catch

**[00:26:20]** regressions in CI before they reach users. And teams at Notion, Stripe, Zapier, Vercel, and Ramp use it to ship quality AI at scale. Braintrust is designed for teams building production AI systems where silent regressions are expensive. It's built for any stack. They have SDKs for Python, TypeScript, Go, Ruby, C. There's no framework lock-in or vendor dependencies. It's SOC 2 Type 2 certified and GDPR and HIPAA compliant. Get started at braintrust.dev. That's braintrust.dev. And now back to the episode. >> Fable is also very expensive. And because of that, when I was testing it, I felt kind of like I was a kid in a candy shop, and I was just like, I'll do this and I'll do this and I'll do that. But now that there's going to be a bill, I'm going to be thinking about it, because I have to pause before I do it to be like, is this going to cost me 100 bucks or whatever? And I do think that's going to limit who gets to use it and for what. So, how do you think about that? >> Yeah, I think it's most clear-cut on the sort of professional software, you know, sort of classic company doing work. Um,

**[00:27:24]** it'll be really interesting. It's like, you know, a lot of process thought goes into pricing as well. There's like um it's both more expensive than Opus and then also I'm like in many ways it's really cheap if you think about, you know, like how much incredible work it's doing. But of course like everybody has their own economics around what they're working with. So anyway, most clear-cut I think for most sort of software teams and I think as an industry if like phase one was uh companies even struggling to get some of their employees to adopt AI coding, where models were early, maybe the tooling wasn't there, and then phase two was great we'll create leaderboards and see who can use the most which you know as you can imagine creates like some like also like not ideal incentives, to phase three where people were like okay now we're just trying to figure out who's using it effectively and like letting them spend as much as possible, having a clear process for that, but making sure we're not doing things wastefully, which I think to me in general makes sense, although I think you could like also over rotate that way, too. Um, I think something of Fable class should hopefully fit in well into that where if you're demonstrating results and you're

**[00:28:27]** getting use out of the model, then hopefully there's a flywheel even inside companies where that goes and uh perpetuates that. I think on the personal use side, it's a really good one. It's a really good question. And I think where I've seen it, you know, even in my personal testing because our personal accounts pay um which is funny like paying my own company I work at but uh but you know you do become more thoughtful about it. Something that was interesting was this uh the app that I built over the weekend actually fit in with like only a bit of extra usage. So it wasn't like a you know thousands of dollars to build this thing that like is a personal thing to myself but it was also spaced out a little bit more. Um, probably the in between of that, what we'll probably have to do the most thinking about is the sort of hobbyist or like independent who's like not, you know, within the larger company, but also uh is thoughtful about the pricing as well. I think my overall advice is like just give it a try and see how much it can do without you having to then do a lot of follow-ups. And it's like I think measuring cost has gotten so uh multifaceted now because

**[00:29:31]** there's the per-turn cost and then there's like what did it cost you not to just do the task but like complete the task to your satisfaction. And I think that's where Fable has really shined for me which is it actually just does it right. So then I don't have to go spend the like 9 or 10 subsequent turns being like no that was not quite what I meant. Like can you also do this um piece? >> It's been really impressive for me because you ask it to go do something and then it just does a thing and you're like, "Wow, you thought through all the little details of this thing in a way that I've never seen another model do." I don't know how much you can reveal about the training process, but what makes the model different? >> I mean, I think in many ways a continuation of a lot of the work that the team has done and I like bow down in total awe of our teams both, you know, on the pre-training and on the RL side. I think that the piece that it has evolved in that at least I noticed the most is kind of adjacent to that as well which is um a sense of the system more than just the individual piece of the work. Like I will often be very positively surprised when it will write something and say all

**[00:30:35]** right but you know I know that like in production this needs to be different like and then it will keep bugging you like have you turned on that like feature flag yet? Like it's not going to work until you do. Um, and you know, I'll sometimes be in sessions that have gone on for days and it'll be like, "Look, you still haven't done that thing." And I was like, "You're right. Like, I didn't turn on that feature. Like, I should go off and do that." Or, um, if we change this, the contract will change over there. Um, or watching it. Actually, one of my favorite times of seeing it in action, I think, where it demonstrates some of the training is watching it respond to code review feedback either from people or from other Claude reviewers. Um, where it doesn't just say, "Oh, yeah, that's an issue. I'm going to go fix it," and will actually be really thoughtful around, hey, like for this level of like sort of fidelity of what we're building, I'm going to accept this risk, or yeah I see what you mean, other code reviewer, which is often just another Fable model like talking to her, like I see what you mean but uh like I'm actually going to push back. I think that that's actually not right. I think getting the model to have that judgment is really

**[00:31:40]** important. And I think um if I had to pinpoint like an area where I feel like it's really progressed, it is that sort of um not just immediate knee-jerk, yeah, yeah, that's right, I got to go fix it, and more, huh, I'll think about that for a minute. No, I thought about it and I still disagree, you know, and I think that's a very um useful um uh sort of ability. It's so valuable to have products like Claude Code out there because you have now like a living breathing thing where people are like this is where the model is doing well and like you know um uh we have like people who test it, I'd count the Every folks as like very very high on the list where like we really trust the feedback because it is being put through its paces and like repeated multi-day you know hard tasks and that also like very much feeds into how we think about like what do we need to improve on the next slide like what are the tasks that we need to specifically think about the model being better at. >> Is chat the right interface for this model? Because it's not very turn-by-turn. It's very like I'm delegating something to you. So, how does that change how you should use it or how you

**[00:32:43]** think about the interface? >> I don't think like the fundamental like you are like sending messages and it is giving you a message back is like totally wrong. I think that there's ways we need to evolve but like there's maybe like three that come to mind. Like one is uh is your laptop the right place for it. So I think that's number one where I mentioned with the side project I was working on how useful it was to have the mobile side. Um Boris who created Claude Code, he's always like you know ahead of the curve on how these models get used. Uh about almost a year ago maybe nine months I was talking to him, he's like yeah I've moved a lot of my Claude Code work to mobile. I was like no way and like uh it took me a while to get there but especially with the Fable class like there's often times where you know because it can keep the session going and we use like kind of remote dev boxes at Anthropic like it is like a thought and be like okay I need can you keep keep up and doing that. So I mean number one is like decoupling the uh where the work is happening from where I'm talking to it about the work. The second one touches a little bit on what I was mentioning earlier around like how do you take everything that Fable has sort of discussed or decided

**[00:33:47]** or proposed about something and make it comprehensible and that's an area that we're thinking a lot about um like there are some skills that um are out there or that we've used around like all right can you diagram this can you do that so that's a place where the current chat UI I think is insufficient where like it will, you experience this, it will give you like a lot of text like this I need to like take a lot of property to fully understand this and I think that um uh that is a piece of property I sometimes will do with Fable is like okay like you have a lot more context on this than I do, can we like back it up like like let's do like more progressive disclosure of the complexity here um so I think that piece is interesting. The last one that I, you know, I think we're still early on um is thinking through multiplayer where you know at some level like these the abstraction levels and like because we have this sort of DRI and like ownership area usually like a chunk of significant work, a human and a couple of Claudes, like that is still flowing together but in other cases that is less the case right where it's you know maybe it's an incident response

**[00:34:50]** where multiple people are thinking about it maybe it's uh you know a project where there's you know multiple competing or not competing but like uh uh conjoining areas that are coming together um and thinking through like what would it mean for you know and we have like chat sharing which gets you a little bit of the way there. But I think there's going to be a need for more like, all right, you've got an independent Claude that's doing a lot of work that was, you know, kicked off by somebody, but can it be keeping up with all the other work happening on the team? I think that is an interesting and underexplored sort of next frontier about how this uh work ends up happening. Um, but I think it's really exciting because I think again it's the level of teammate collaborator that the models are now capable of and we're almost holding them back by not having the right abstractions around them for that to happen. >> Yeah, it makes me think I've mostly been using this for my own vibe coded stuff. So I haven't really had to think about this, but there's a problem when you're using this inside of an organization, which is do I really understand every part of this and therefore how do I transfer the context

**[00:35:55]** of what the model just did into my brain? Like that's one of the big bottlenecks. How do you think about drawing the line, especially with a model like this, around how much you actually need to understand and how to make sure that you have enough context on what it's done to feel comfortable? >> I think there's like two big pieces here. The first is verification where um I became like fully verification-pilled earlier this year and now like almost in the same way, and actually it connects to how uh I think I used to do when I was sort of typing code more full-time, which is try to find the sort of tightest dev loop that you can around the idea that you're trying to develop in. Like sometimes with Instagram that meant like you know actually making a new build target in Xcode that was just that screen with some sort of synthetic data and just doing that dev loop and I would mentor newer engineers like if there's one thing that I can impart on you like it is try to get that for any project you're working on and things will go much more quickly. I think that is no longer exactly the case here. But I think what is the case now is anytime I set it up like how do I get like for every pull request that

**[00:36:57]** Claude is putting up that there is an attached you know photo or video whether that's an iOS PR whether that's um you know something in the UI and that's I think that helps you gain a lot of confidence because even now you know you might have like you know uh Fable go off and do work for a couple of hours and be like I'm done and it's really useful to say like and here's the like full screenshot gallery of the full right because you might say like oh you know what on screenshot eight, that error state, I've never actually seen it, but I could see how, you know, a person might hit it. Let's actually make that different. Um, and so getting that comprehensive verification, I think, uh, something we've been working on a lot internally and like sort of publishing more and more skills and knowledge about, but I think is a really key piece there. Um, and then the second one is I think you ultimately as a person still need to stand behind the work that you are doing, especially if you're putting it into a production system. Like a lot of people use Claude every day. Uh there's still the accountability of like oh Claude might have written it but like you need to understand you know at least the general decisions that were made on these pieces as well and so I have seen uh a fair amount of engineers

**[00:38:00]** actually adopt this practice where like Claude will have done the work but then there is like the follow-up conversation around well can you like can I make sure I deeply understand like all the trade-offs that you made and whatever lowercase-a artifacts need to be produced in order to make that comprehensible um is important. It is really interesting though to be in meetings where somebody will say like oh yeah and I have this PR ready and somebody else asks like oh that's interesting like did you do X or Y and have that moment of pause and they're like you know what I'm not entirely sure, I will find out before we merge this PR, and that's you know I think that adapting to that norm and figuring out how to work with that is something we'll have to do. >> Tell me more about the verification loop, it's such a hot topic right now. Sounds like one way that you do that is with screenshots and screen shares, but what are the other ways that you think about that? >> I think part of it it starts in can you get to a place where you are uh exercising real like uh sort of real flows that aren't just like a static injected piece, and as the system gets more complex that gets more and more complicated. Um, so we've invested a

**[00:39:03]** bunch into like even just getting it so that the, you know, the iOS app can log in to staging on a real account and like have real data, but then you don't want it to then go through like an eight-stage onboarding process every time you're just trying to test like the second part of the screen. Um, so there's a lot of work around like how do you, you know, is there a special affordance? Is there like some shared secret? Whatever that is around getting the like app, you know, to really feel as human, you know, using the product as possible. So, that's one aspect of it. Um, the second is like this mix of like well-known paths versus the things you're exercising in the exact moment, like the former being really useful for regression testing. And so, we definitely have places where we've expressed like uh sort of ideal workflows in text basically and Claude can repeatedly check that. And then there's also, and Claude does a really good job of this, sort of expressing the intent of the current change at hand. So that gets really really deeply exercised. So I think that the combination of those two things is important. The visual verification that I mentioned as well. Um video has been really cool to see. Actually video is

**[00:40:06]** a very underexplored tool to give Claude as well. Like a thing I've been prototyping is uh just giving Claude uh video captures of the thing that it has built and then giving it just basically ffmpeg and you'll watch it scrub through and say like, "Oh, this animation has some jank in it. I'm going to go fix that." And it never would have been able to do it with like a screenshot sort of uh latency capture because it will have missed the moment. So I think that's another piece that's really really important. And then for the pieces that aren't sort of easily testable end to end because there is some more complex system um getting Claude to go and build like as robust a sort of you know mock back end as possible or use ones off the shelf has been also really interesting like like when I think about Artifact um we had really comprehensive tests, this is kind of prel, and one of the ways that we were able to do that really robustly was that basically every piece of infra we had whether it was Postgres, Redis, um you know all the AWS things had a really good in-memory implementation that you could just do really quickly in unit tests and kind of extending that to like Claude-land now you know I was working on something

**[00:41:09]** where it had like a pretty robust backend and for kind of complicated reasons it was hard to spin that up on my dev server but it was able to again one-shot a really like proxy for that, uh by proxy I mean like a substitute for that, and that was so valuable and over time it's been interesting as that like uh substitute has evolved as the rest of the code has evolved which is the thing that you know if you had pitched that idea to me before I'd be like well that's going to be really hard cuz the upstream's going to change, how you going to keep it in sync, and I don't think about that anymore. I'm like yeah Claude will read the changes and it will adapt the thing and it'll keep the two in sync and that's fine. >> There's some really interesting architectures around when you get a bug it just automatically goes out and closes it you know the agent just gets kicked off it closes it and then it sends a message to the customer being like it's fixed. Are you noticing with Fable any change in how that process works? >> Yeah I think there's a couple of like it um on a very like human-to-human or human-to-Claude level. One of the things that I've seen it do um bet other models have been capable. I just need to do it really consistently too is if the bug report for example came from somebody

**[00:42:13]** you know mentioning something in our like feedback channel in Slack um and then like the thing that got fed into the Claude Code session is like oh there's this and because of the Slack MCP you can actually pull the thread. Um have it then actually post back uh you know as me and it'll be like hey this is Mike's Claude like I fixed it here's the you know here's the pull request but then I think, and the previous Claudes have done it, the thing it does really well is then say but hold tight it's not in production yet I'll follow up when it actually is and then like maybe a few hours later like oh like this deploy went out like you should go test it is it fixed now. Like that level of follow-through I think is new on the closing the loop piece and uh I have these long-running Claude Code sessions that are basically like interacting as me. I guess let's put some disclaimer in there too. Um, and the second goes back to that like taste and discernment piece that we were talking about, which is like it's one thing to say there was a bug report, therefore I must go fix this thing. And it's another one to say, you know what, like this like I hit this over the weekend. One of our internal systems uh basically had been running without restarting for a while. There was a memory leak. Um, and uh it had good discernment saying like, "All right, Mike, like it's the

**[00:43:16]** weekend. Like just rebalance the server. It's going to solve it for now and like we'll work on the like I'll asynchronously get the PR going to like fix this more long term." So I think if you're going to have Claude in the loop in this kind of like sort of close the loop bug report or system sort of issue to change I think you really want it to understand where, you know, as any good SRE or engineer in the loop would, like let's solve the problem at hand, let's like defer the question of like do we need to rearchitect on top of a completely different language um and understanding that balance is really important. >> One of the things that's like really exciting, mostly exciting to me about new models is it raises the floor so that everyone can kind of go build apps in one shot. Um, but it also raises the ceiling for experts. So like if you're a software engineer or founder, you can just go do things that you never would have been able to before because you have access to this really powerful model. So for me, I built this one version of Borges's uh infinite library. It's like a 3D game version of the library. It's wild. It runs right in the browser. It's so good. I can find like any Every essay inside of it. I'll send you the link. >> Sick.

**[00:44:19]** >> But I think there's going to be this flowering of people doing things like, "Oh, I made a game or maybe I trained a new model or whatever that they couldn't do before." And I'd love to give people some inspiration, some examples of things that they might be able to do that they might not be thinking to do with this model. What are some ideas that come to you? >> Yeah, I think a few um maybe I'll start with the fun side and like riffing off the game piece. Like I think people have a lot of like creative ideas for how do they express the complexity of what they are like their world like everybody has the thing that they know really really well and there's probably some level of like how do I then explain that to somebody else um or how do I apply techniques elsewhere that I could then go off and do. Um, my wife is uh studying um like environmental engineering like studying geothermal like very complex math and simulations and I've seen like as the models have gotten better she has been able to apply even more complex techniques from even outside of that domain into that work. And I think what people should be able to do, you know, like full-on PyTorch end-to-end simulations of that

**[00:45:22]** work in a way that wouldn't be possible. I think that maybe is one is like bring the like beautiful complexity of what you have and either show it to other people by like maybe making a game or maybe making a visualization which I've seen her do as well or at least like make you know bring other techniques to bear. Um, and then the second piece is its ability to compose software that like solves a really unique problem to you. Um, and I've seen that internally. A lot of the work that we've been doing is how do we get as many of our internal systems like MCP-ified with the right permissioning structure and the right deployment kind of setup. Although externally you have good options around some of these like platform as a service pieces and you can just ask Claude about them and it'll like help you set things up. But like I love that feeling of like that thing that you always wish that you had. And then what has blown my mind uh there was a person who works in our go-to-market organization um has been like building this like really like deeply thought-out integration of Claude into every part of her whole process and you don't have to stop at that one shot like she's been working on it for months now and she can keep going and like I think one of the things that is maybe underappreciated about the models is I

**[00:46:25]** think in previous generations they would eventually get to a complexity level where it was hard to iterate on it without feeling like you then would break the thing that they had you know like under or over abstracted. Whereas this is actually, you know, she's had access to something Fable or Fable-like for a couple months and like you've just seen it keep growing and growing and growing and growing and now she's like deploying it to the whole GTM org. Like I think that is really cool. Like the ceiling of complexity that a person that does not start out as technical can now build for solving problems within their domain is like is unprecedented. >> I agree. It writes great code. Like my benchmark that I have is called the senior engineer benchmark. I just have it see if it can rewrite a codebase from uh from first principles and the nearest model, the like previous top was like a 62 or 63 out of 100 and this model got a 90 on the benchmark or 91 which is human senior engineer level. Like you can just keep going with this thing in a way that's really fantastic. I'm curious though one other thing that's really powerful that you mentioned is dynamic workflows. Tell us about that. >> This is um you know we'll build things internally sometimes and I will go really uh aggressively bug the engineer

**[00:47:29]** who built it and be like when are we shipping this publicly because I think people are going to really like it. Um there's good reasons why it was like built internally but like we try to ship as many of these as possible. Um and dynamic workflows was like definitely that to me. The person who built this is an engineer named Sid who's awesome and I was like Sid like I want to get this out to the world because it's so good. Um but I think it's especially good with uh a model like Fable for two really big reasons. One, it helps uh sort of uh create the scaffold for like deep meaningful work. The craziest dynamic workflow I did and used Fable for was I had uh an internal project that we had written in Python, but we needed it actually in TypeScript for like a really specific deployment reason and having been internal to Instagram and we were like should we rewrite the whole thing in Hack and you know port it to the PHP engine at Facebook, I was like you never would have done that, like maybe they can now with the model but you know at the time it seemed impossible. Uh, but here I had, you know, pretty complex codebase and I was like, I'm just going to set up a dynamic workflow and just let it run over the weekend. And it did. And the workflow was so cool. It was like, all right, I'm

**[00:48:31]** going to do like a deep understanding of the work. I'm going to create sort of like a almost like a spec of how everything works. I'm going to go module by module. I'm going to translate these pieces. I'm going to test it incrementally. I'm going to do another adversarial test. I'm going to go check for anything that I missed. And it was this like really cool like series of steps that the workflow was able to orchestrate. And I came back and I was like, yeah, this thing is a like TypeScript and Bun port of that thing and it's actually better in these ways. Um, and it was very, you know, sort of documented like these are the things I couldn't port but most of these were like very specific to the specific implementation. It wasn't worth porting and I do not think you could have done that (a) with previous models at that level of success and (b) uh without like the kind of scaffolding that workflows provide. So I think that is extremely exciting kind of uh kind of combination of model capabilities and then our own ability to like orchestrate them over longer and longer time horizon with that feeling of like you had a goal you broke it down effectively and then you were able to make make it

**[00:49:33]** work. The other piece is I think over time we'll be able to also make some of those subtasks um sort of tuned to the uh have the model be tuned to the level of complexity of it. So you can imagine that some parts of the dynamic workflow don't need extra high thinking. They could use, you know, a medium thinking to get it done or even a smaller model. And I think uh that's really the future of where these things are going. So yeah, I'm a huge workflows uh DAU. >> For people who haven't used it before, tell me about how you got that workflow made. How did you design it? How did you make sure it was good? >> Yeah, it was pretty iterative but sort of just started with Claude Code like hey I have this complex you know kind of task like let's design a workflow to go and do it. It kind of showed me the plan. I was like oh this is like close to what I want. I want to make sure that you do these three or four levels of uh of like additional verification for missed features. It's like here's what you have. Are you ready to go? And it expresses the workflows in code which I think is really valuable to kind of see what it was about to do. Um, and then, um, what was interesting is it did the full port and then I had like a couple of like follow-up kind of questions that I had

**[00:50:35]** or like little tweaks and I did those as sort of like mini workflows that built off the the previous one as well. But I think that's like uh you know we we talked a little bit about whether chat was the was the right interface and we've had that conversation over the last year and I think um workflows are a good uh middle ground of uh you can compose them using chat but they're expressed using code and then they're executed with like I think a nice clean UI around what's happening at every stage and like I think we'll start bridging longer horizon work with chat in ways like that over time. >> Mike, this is such a great conversation. Thank you so much for joining and telling us all about this new model. I'm really excited to to get to spend time with you and really really look forward to what people uh think outside, too. >> Oh my gosh, folks, you absolutely positively have to smash that like button and subscribe to AI and I. Why? Because this show is the epitome of awesomeness. It's like finding a treasure chest in your backyard, but instead of gold, it's filled with pure, unadulterated knowledge bombs about chat

**[00:51:39]** GPT. Every episode is a roller coaster of emotions, insights, and laughter that will leave you on the edge of your seat, craving for more. It's not just a show, it's a journey into the future with Dan Shipper as the captain of the spaceship. So, do yourself a favor, hit like, smash subscribe, and strap in for the ride of your life. And now without any further ado, let me just say Dan, I'm absolutely hopelessly in love with you.

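Editor's sketch, not from the episode: the porting workflow Mike describes at 00:48:31 (understand the code, write a spec, translate module by module, then verify and check for anything missed) and his point at 00:49:33 that some steps may not need the highest thinking level can be pictured as a small staged pipeline expressed in code. Everything below is a hypothetical illustration in Python. The `Stage` type, the `effort` labels, and the stub functions are assumptions standing in for model calls, and this is not how Anthropic's dynamic workflows are actually implemented.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Stage:
    name: str
    effort: str                  # hypothetical label, e.g. "high" or "medium"
    run: Callable[[dict], dict]  # takes workflow state, returns updated state


def understand(state: dict) -> dict:
    # Stand-in for a deep read of the codebase: list the modules to port.
    state["modules"] = sorted(state["source"])
    return state


def write_spec(state: dict) -> dict:
    # Stand-in for "a spec of how everything works": one entry per module.
    state["spec"] = {m: f"behaviour of {m}" for m in state["modules"]}
    return state


def translate(state: dict) -> dict:
    # Go module by module; implementation-specific modules are documented
    # as skipped instead of being ported.
    state["ported"], state["skipped"] = [], []
    for m in state["modules"]:
        if state["source"][m] == "impl-specific":
            state["skipped"].append(m)
        else:
            state["ported"].append(m)
    return state


def verify(state: dict) -> dict:
    # Stand-in for the incremental and adversarial tests plus the final
    # "anything I missed?" pass: every module must be accounted for.
    accounted = set(state["ported"]) | set(state["skipped"])
    state["missed"] = sorted(set(state["modules"]) - accounted)
    return state


STAGES = [
    Stage("understand", "high", understand),
    Stage("spec", "high", write_spec),
    Stage("translate", "medium", translate),  # assumed to need less thinking
    Stage("verify", "high", verify),
]


def run_workflow(source: dict) -> tuple[dict, list[str]]:
    """Run every stage in order and keep a log of stage name and effort."""
    state: dict = {"source": source}
    log: list[str] = []
    for stage in STAGES:
        state = stage.run(state)
        log.append(f"{stage.name}:{stage.effort}")
    return state, log


if __name__ == "__main__":
    final, log = run_workflow(
        {"auth": "portable", "billing": "portable", "py_shim": "impl-specific"}
    )
    print(log)
    print(final["ported"], final["skipped"], final["missed"])
```

Keeping the stages as plain data makes the plan reviewable before anything runs, which matches Mike's remark that seeing the workflow expressed in code was valuable; a follow-up "mini workflow" would just be another short list of stages run over the same state.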