ChatGPT/Every illustration.

AI Focus Groups—And Soon AI Copywriters—Will Make Ads Superhuman

How LLMs will beat the best human marketers

In Michael Taylor’s latest piece for Also True for Humans, his column about working with AIs like you'd work with people, he explores what comes after the Turing test (when a machine’s performance is no longer distinguishable from a human’s) in the world of advertising. He lays out the path for how AIs testing ads on other AIs will surpass the best human copywriters. Michael applies the wisdom of the crowd concept to working with AI to determine how agent-based simulations can help us make better business decisions. If you’re interested in synthetic market research and virtual audiences, join the waitlist at askrally.com. –Kate Lee


“Ninety-five percent of what marketers use agencies, strategists, and creative professionals for today will be handled by AI—and the AI will likely be able to test the creative against real or synthetic customer focus groups.”

OpenAI’s Sam Altman tossed that line off in an interview few people noticed, but it lodged in my head. 

As a former marketing agency owner, I know how time-consuming and expensive focus groups are. I also know that AI excels at roleplaying, so it stood to reason that I might be able to get it to act like a group of potential customers. If you could test ads against an AI audience first, I thought, it could reduce the risk of running unsuccessful ads and massively impact the advertiser's bottom line. Consumers, too, should benefit from seeing fewer irrelevant ads. 

That led me to build Rally, a tool that helps marketers and entrepreneurs test their products against virtual audiences. It queries multiple AI personas simultaneously and aggregates their responses into actionable insights. At the risk of sounding immodest, it feels like a big step in solving the focus group side of the equation.

However, a limiting factor so far in setting AI loose to run a fully automated ad campaign has been on the composition side: simply put, purely AI-written ad copy isn’t great.

I believe that’s about to change. AI-generated content is at an inflection point. In a recent study carried out by researchers at the University of California San Diego, GPT-4.5 showed remarkable progress on the Turing test, in which people try to distinguish a machine’s performance from a human’s. In almost 75 percent of attempts in the study, people mistook GPT-4.5 for a human.

GPT-4.5 with a persona is mistaken for human 75 percent of the time. Source: Arxiv.

When I saw this study, my first thought was how quickly this progress is going to hit the advertising industry. In no time at all, ads will be superhuman, beating anything a human marketer can do given the same brief. The same thing happened with AI image models, which were derided because they couldn’t properly generate hands, lulling creative professionals into a false sense of security. The latest models solved that issue faster than anyone was prepared for.

What kind of system do you need to build to fulfill Altman’s vision, where AI is doing 95 percent of what marketers do today? Imagine algorithms that spin up and test millions of ad variations on lifelike virtual audiences—discarding the duds and surfacing winners before a single human sees them. Let's explore the milestones on AI’s path to out‑creating the best human marketers and lay out the practical moves you can make now to stay in front of the coming shake‑up.

The five levels of autonomous AI advertising

Most marketers have asked an AI like ChatGPT or Claude to review their ad copy and creative assets. You get basic feedback on clarity, tone, and grammar, as well as generic best practice advice. You can then ask it for suggestions on how to rewrite the ad to address these issues.

Claude gives generic feedback on Jaguar’s new ad campaign. Source: Every/Michael Taylor.

I've been doing this religiously since GPT-3 launched in 2020, and while it's genuinely useful for catching obvious mistakes or clunky phrasing, the system doesn’t understand how humans might subjectively respond. The AI evaluates based on patterns it's learned from training data rather than simulating authentic human reactions. Think of this as the training wheels phase. But AI can do much more—these are the stages we’ll pass through on the road to superhuman ads.

Level 1: Roleplay real customers to uncover pain points

At this first level, we begin leveraging the roleplaying capabilities of modern AI by explicitly instructing it to adopt specific consumer personas. AI can be surprisingly good at getting in your customer’s head based on what it knows about every type of person who has been written about online. Instead of generic feedback, you get responses like, "As a busy parent of three who values convenience, this ad speaks to my pain points around meal planning." 

AI is capable of mimicking survey responses with 85 percent accuracy when given the transcript of a two-hour interview with a subject, though I’ve observed a tendency to over-index on whatever stereotypes you employ in your prompt. To be successful, steer clear of such stereotypes and prompt with attributes that are truly predictive of human behavior (like Big Five personality traits or education level).
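To make the prompting advice concrete, here is a minimal Python sketch that builds a roleplay prompt from Big Five scores and education level rather than from a stereotype. The `Persona` fields, the score thresholds, and the prompt wording are all assumptions made for illustration; it is not how any particular tool does it, and the resulting string would still need to be sent to a model of your choice.

```python
from dataclasses import dataclass


@dataclass
class Persona:
    """Hypothetical persona record; the field names are illustrative only."""
    name: str
    age: int
    education: str
    # Big Five traits scored from 0.0 (low) to 1.0 (high)
    openness: float
    conscientiousness: float
    extraversion: float
    agreeableness: float
    neuroticism: float


def describe_trait(label: str, score: float) -> str:
    """Turn a numeric trait score into a plain-language phrase."""
    if score >= 0.67:
        level = "high"
    elif score >= 0.33:
        level = "moderate"
    else:
        level = "low"
    return f"{level} {label}"


def build_persona_prompt(persona: Persona, ad_copy: str) -> str:
    """Assemble a roleplay prompt asking for a first-person reaction to an ad."""
    traits = ", ".join([
        describe_trait("openness", persona.openness),
        describe_trait("conscientiousness", persona.conscientiousness),
        describe_trait("extraversion", persona.extraversion),
        describe_trait("agreeableness", persona.agreeableness),
        describe_trait("neuroticism", persona.neuroticism),
    ])
    return (
        f"You are {persona.name}, age {persona.age}, education: {persona.education}.\n"
        f"Personality: {traits}.\n"
        "Stay in character and react in the first person to this ad. "
        "Say what you like, what puts you off, and whether you would click.\n\n"
        f"Ad: {ad_copy}"
    )


# Example persona and ad copy are invented for this sketch
persona = Persona(
    name="Dana", age=34, education="bachelor's degree",
    openness=0.8, conscientiousness=0.7, extraversion=0.3,
    agreeableness=0.6, neuroticism=0.2,
)
prompt = build_persona_prompt(persona, "Dinner planned in five minutes a week.")
print(prompt)
```

Keeping the traits as numbers makes it easy to generate many varied personas later, instead of hand-writing a description for each one.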

ChatGPT can talk like one of your customers when prompted. Source: Michael Taylor.

The key limitation is that these personas exist in isolation, without the diversity of thought that you get from a wider audience.

Level 2: Crowdsource instant insights with virtual survey panels

Rally, my virtual audience simulator, is at this level. Rather than manually prompting each persona variation, you can generate a diversity of virtual panels representing your target audience segments, and reuse them for multiple queries. You ask a question, and between five and 100 AI personas respond with their individual answers, which are summarized for you. Many of my customers put that feedback right into ChatGPT or Claude to work on new variations, which they then test in Rally again.
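The ask-everyone-then-summarize pattern described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Rally's implementation: `ask_persona` is a hypothetical stand-in that returns canned answers with made-up weights where a real system would call a language model, and the pricing question is invented.

```python
import random
from collections import Counter


def ask_persona(persona_id: int, question: str, rng: random.Random) -> str:
    """Stand-in for a model call: returns one of three canned answers.

    A real panel would send the persona prompt plus the question to a
    language model. The weights below are made up for illustration.
    """
    return rng.choices(
        ["too expensive", "fair price", "would pay more"],
        weights=[0.5, 0.4, 0.1],
    )[0]


def run_panel(question: str, panel_size: int, seed: int = 0) -> Counter:
    """Ask every persona the same question and tally the answers."""
    rng = random.Random(seed)
    answers = [ask_persona(i, question, rng) for i in range(panel_size)]
    return Counter(answers)


def summarize(tally: Counter) -> str:
    """Report each answer's share of the panel, most common first."""
    total = sum(tally.values())
    return "; ".join(
        f"{answer}: {count / total:.0%}" for answer, count in tally.most_common()
    )


tally = run_panel("Is $12 a month a fair price?", panel_size=50)
print(summarize(tally))
```

The panel is created once and can be reused for every new question, which is what makes the test-rewrite-retest loop cheap.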

No matter how powerful AI gets, though, it’ll still need to be fine-tuned against real-world results to make sure it stays accurate as culture changes and new trends arise. My experiments with this approach revealed that response patterns become much more realistic and statistically accurate when you calibrate the results against real-world studies, and you start seeing how different messaging resonates with different demographic groups.

Asking for pricing feedback from 50 young parent personas generated by AI. Source: Rally.

However, these virtual panels lack the dynamics that make traditional, in-person focus groups valuable, because the personas don’t interact with and influence each other.

Level 3: Spark ideas with synthetic focus‑group dialogue

The difference between a survey and a focus group is that in focus groups, people interact with each other. There are experimental examples of synthetic focus groups like TinyTroupe, where AI agents can interact with each other, but I haven't seen a reliable demo so far.

It’s only a matter of time, and once a reliable working system is achieved, it will be a big step forward. The increased randomness that comes with group dynamics means the AI (or human) that writes your ad is more likely to find an offhand comment that forms the core of a great ad idea. It hasn’t happened yet because making a lifelike conversation flow between agents is hard. Personas need to build on each other's comments, occasionally disagree, and collectively surface insights no individual would have mentioned alone. A system that managed this would help you capture social dynamics, groupthink effects, and even how dominant personalities influence others.

A virtual focus group about a new feature for Microsoft Word. Source: TinyTroupe.

For now, synthetic focus groups that enable AI agents to interact with each other get derailed as the agents often get stuck in a loop or go off topic. AI models are just not advanced enough to interact reliably, so our ability to observe how ideas spread, get challenged, or gain consensus among virtual participants is limited.

Level 4: Stress‑test campaigns inside immersive market simulators

Now we’re firmly in experimental territory. At this advanced stage, AI personas don't just discuss your creative assets—they interact with simulated environments like websites, stores, or even entire societies (one paper populated a virtual town with AI agents and measured how information spreads). 

The data generated could be phenomenally rich: heatmaps of attention and detailed funnel analytics, but also graphs of how ideas spread virally through the population. Studies show that there is a significant social component to which products succeed or fail, as everyone is to some degree influenced by the choices of others. That means we can’t predict which products will work without modeling how agents react to the actions of other agents. And because those chains of influence play out differently each time, every ad tested needs to be run multiple times to see how well it performs on average.
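A toy Python model shows why a single run is not enough once agents influence each other. Everything here is an assumption chosen for illustration: agents decide one at a time, and each agent's chance of adopting blends the ad's own appeal (`base_appeal`) with the share of earlier agents who adopted, weighted by `social_weight`. Real simulators would be far richer, but the run-to-run spread appears even in this sketch.

```python
import random
import statistics


def simulate_once(n_agents: int, base_appeal: float, social_weight: float,
                  rng: random.Random) -> float:
    """Run one pass: agents decide in sequence, nudged by earlier adopters.

    Returns the fraction of agents who adopted.
    """
    adopted = 0
    for i in range(n_agents):
        share_so_far = adopted / i if i > 0 else 0.0
        # Blend the ad's own appeal with what earlier agents did
        p = (1 - social_weight) * base_appeal + social_weight * share_so_far
        if rng.random() < p:
            adopted += 1
    return adopted / n_agents


def simulate_many(runs: int, seed: int = 0, **kwargs) -> tuple[float, float]:
    """Repeat the simulation and return the mean and standard deviation."""
    rng = random.Random(seed)
    results = [simulate_once(rng=rng, **kwargs) for _ in range(runs)]
    return statistics.mean(results), statistics.stdev(results)


mean, spread = simulate_many(
    runs=200, n_agents=100, base_appeal=0.2, social_weight=0.5
)
print(f"mean adoption: {mean:.2f}, run-to-run spread: {spread:.2f}")
```

Comparing two ads then means comparing their averages over many runs, with the spread telling you how much any one run can be trusted.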

A Hacker News simulator using AI personas cloned from real commenters. Source: Rally/Michael Taylor.

As with level three, the major limitation of these systems is that they simply aren’t reliable yet. I have experimented with simple versions of these simulators, but any complexity gives the AI agents more opportunities to go wrong, and the errors compound quickly.

Level 5: Deploy self‑improving ad engines that learn on the fly

This is the dream: When synthetic focus groups continuously calibrate themselves against real-world performance data, the system automatically adjusts its models to minimize the gap between prediction and reality. For example, the system might predict an ad would cause three times more people to buy a pair of running shoes, and in real life the ad drives over four times as many sales. The difference could be analyzed and the system readjusted until it retroactively predicts the right number, leaving it better able to figure out what will work next. Humans will primarily investigate errors in the system, rather than being its primary drivers.
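The running-shoe example can be worked through with a very simple calibration rule, sketched here in Python. The approach is an assumption for illustration only: a single multiplicative `factor` scales the simulator's raw prediction, and after each campaign it moves part of the way (set by a hypothetical `learning_rate`) toward the value that would have matched reality. The observed lift of 4.0 is a stand-in for "over four times."

```python
def update_calibration(factor: float, raw_prediction: float, actual: float,
                       learning_rate: float = 0.5) -> float:
    """Move the calibration factor part of the way toward the value that
    would have made the calibrated prediction match the real result."""
    target = actual / raw_prediction  # factor that would have been exact
    return factor + learning_rate * (target - factor)


raw_prediction = 3.0  # simulator's predicted sales lift (3x), from the example above
actual = 4.0          # observed lift; 4.0 is an illustrative stand-in

factor = 1.0  # start by trusting the simulator as-is
for campaign in range(3):
    factor = update_calibration(factor, raw_prediction, actual)
    calibrated = factor * raw_prediction
    print(f"after campaign {campaign + 1}: calibrated prediction = {calibrated:.3f}x")
```

Each pass closes half of the remaining gap, so the calibrated prediction climbs from 3x toward 4x without overreacting to a single campaign.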

This creates a virtuous feedback loop where each real-world campaign improves the accuracy of future simulations. Over time, these self-improving systems develop an almost uncanny ability to predict market responses across diverse segments and product categories, because they have already seen what happened millions of times in the simulation. When this milestone is fully realized, the need for traditional market research (or even marketers themselves) will fundamentally shift.

The Ad Creative Production Process. Source: Mobile Dev Memo.

What will marketers do for work?

Altman believes we’ll have an AI that is superhuman at persuasion—the heart of advertising—before we’ll have artificial general intelligence (when an AI can do anything a human can do but better). He thinks we’re about five years away from AGI.

If you’re in marketing or adjacent to it, you’ll need to start thinking about how this affects your career plans. Even when the ad agency is fully automated, we’ll still need humans to be responsible for the outcomes. For instance, account managers are essential (even though they do none of the marketing work), mostly to save clients from themselves and serve as an intermediary. However, other roles, like marketing analysts, might be more easily automated.

Even if AI does automate 95 percent of marketing, a curious feature of the advertising industry is that spending has remained at 1-2 percent of GDP for over 100 years, through World War II and the digital revolution. It’s possible that AI will free up enough time and money to do our best work on every campaign, making every TikTok ad in your feed in five years as good as a Super Bowl ad is today.

Human culture is a messy, dynamic system, and it is expensive to predict with any degree of accuracy. We might be able to predict human responses to surveys with 85 percent accuracy with LLMs, but if you interview the same human a week later, their answers change by 19 percent! Synthetic economies will need to be constantly calibrated to real-world outcomes as trends develop. Some people might make a living as professional focus group attendees, with their responses used to train LLMs to better replicate human preferences. As a consumer, get ready for more hyper-personalized and relevant ads, while watching out for ads optimized to get you to buy things you don’t need.

We may no longer be the Don Drapers in a post-AGI world, coming up with flashy campaign ideas and executing them, but it’s still humans who buy things—and humans are unpredictable. It’s in that unpredictability that we will continue to thrive.


Michael Taylor is the CEO of Rally, a virtual audience simulator; a former marketing agency owner; and the coauthor of Prompt Engineering for Generative AI.

To read more essays like this, subscribe to Every, and follow us on X at @every and on LinkedIn.
