Noise-Canceling Filters for the Internet are Coming
What happens when “the algorithm” is just a prompt to GPT-4?
I’m writing this at a busy cafe, but I can’t hear anything except my music, thanks to the algorithms that power my noise-canceling headphones.
When I turn my attention towards Twitter, it’s not so easy to tune out the noise. For each interesting tweet I see, my feed shows me roughly three vapid threads, two reposted TikToks, and maybe one semi-funny meme. Ok, sometimes they’re pretty good:
But still, when I see too much content like this, I start to feel bad in a way that is hard to explain. I call it “viral malaise”—that feeling when you’ve been exposed to too much viral content and your brain feels numb and slightly sad. Perhaps you can relate.
It makes me wonder: thanks to LLMs like GPT-4, is it now possible to build tools that can effectively de-noise our information diets, the same way AirPods can silence the noise in our physical surroundings? How would they work? How would they make us feel? What first- and second-order consequences would widespread use of these tools have on culture, media, and politics?
I’m convinced AI noise filters are coming, even if social media companies are resistant at first. They will be effective, and they could have as large an impact on society as social media itself. Here’s why.
The spark that got me excited about this idea was, ironically, a tweet:
I set my moderation bot loose!
It's banning people on its own. Which means near-instant bans for spammers, so their content is never seen at all.
It still notifies me, and I click the links to double check so I can reverse a bad decision…
…but it's right 99% of the time 🔥
Courtland built this bot using GPT-4 to help him moderate the discussions on a community for entrepreneurs he runs called Indie Hackers. Apparently there was a huge increase in quality when he switched from smaller models to more powerful ones:
Yeah I tried using babbage or something cheaper than davinci and got mediocre results.
Then I tried GPT-4 with a few example prompts and it was instantly amazing. Way simpler to use, too.
But yeah, expensive. Might go back to give the fine-tuning models another try.
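Courtland hasn't published his bot's actual prompt or code, but a moderation bot like this could be surprisingly little code. Here's a rough Python sketch of what the few-shot prompt and verdict-parsing might look like; the system message and examples are my own guesses, not his:

```python
# A minimal sketch of an LLM moderation bot. The prompt wording and few-shot
# examples below are hypothetical -- Courtland's real setup isn't public.

FEW_SHOT_EXAMPLES = [
    ("Check out my FREE crypto signals group, link in bio!!!", "Spam"),
    ("I just launched my SaaS after 6 months of nights and weekends. AMA.", "Not spam"),
]

def build_moderation_prompt(post: str) -> list[dict]:
    """Assemble a few-shot chat prompt that asks the model for a one-word verdict."""
    messages = [{
        "role": "system",
        "content": (
            "You are a moderator for a forum for entrepreneurs. "
            "Reply with exactly 'Spam' or 'Not spam'."
        ),
    }]
    for example, verdict in FEW_SHOT_EXAMPLES:
        messages.append({"role": "user", "content": example})
        messages.append({"role": "assistant", "content": verdict})
    messages.append({"role": "user", "content": post})
    return messages

def is_spam(verdict: str) -> bool:
    """Parse the model's one-word reply into a boolean."""
    return verdict.strip().lower() == "spam"

# In production you'd send `messages` to a chat completions endpoint and --
# as Courtland does -- notify a human so bad bans can be reversed.
```

The key move is the few-shot examples: as Courtland found, GPT-4 plus a handful of examples beat cheaper fine-tuned models.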
This might not seem like much, but to me it’s a big deal. Today, most algorithms that recommend or suppress content act purely on the basis of inferred popularity. They look at how much time people spend engaging with a piece of content, and boost it to more people if the numbers look good. The content itself is almost purely a black box. Some algorithms try to classify content with tags like “food” or “funny meme” so that they can boost more of the same type of content you’ve engaged with in the past, but they have nowhere near the semantic understanding of an LLM. And if you want to change the algorithm, you have to train it all over again, which is time-consuming and expensive.
With LLM-based content filtering and sorting, the underlying model is completely agnostic about what content to promote or suppress. It is a general-purpose reasoning engine. If you want to change its behavior, you don’t need to train a new model. All you need to do is update the prompt.
The consequences of this are subtle, but incredibly important. For the first time, each user could have their own recommendation algorithm, because the “algorithm” is really just a prompt to an LLM. Even better, these prompts are written in natural language, so users can understand the prompt and update it however they want, without the need for any special training.
Here’s how I’m imagining this could work:
Ideally, we could all write out a policy to an LLM describing what kind of content we wanted to see more of, and what kind of content we wanted to see less of. There would be a sensible default provided by the system, or maybe even a few templates to choose from. If we want to get really crazy, we could imagine being “interviewed” by the AI to get a deep sense for what motivates us and what our goals are.
You can experiment with this now, if you want, by pasting the following prompt into ChatGPT and going along with it:
Your job is to select what content I should see, and what content I should not see. In order to do this job well, conduct an interview with me to understand at a deep level what motivates me, what topics I’m interested in, what my sense of humor is, and what important personal goals I am working towards. Ask one question at a time, and feel free to probe deeper or ask follow-up questions as needed.
It’s hard to explain just how seen this prompt made me feel. It made the idea of having an intelligent agent try to understand my actual personality and goals and boost or suppress content for me so much more real and compelling. But this is just the start.
When our preferences change or when we think the algorithm isn’t doing a good job, we can update it by continuing the conversation. For example, we could flag an example or two and explain what we do or don’t like about it. Or we could just say, “Hey I’m feeling bored, can you switch it up a bit?”
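To make this concrete, here's a toy sketch of what a user-owned "algorithm" might look like as a data structure. Everything here is hypothetical, but it illustrates the point: the entire policy is just editable text, and "retraining" is just appending to it.

```python
# Sketch: a user's recommendation "algorithm" as nothing more than editable text.
# All names and policy wording here are hypothetical.

from dataclasses import dataclass, field

@dataclass
class FeedPolicy:
    """A per-user recommendation 'algorithm', stored as natural language."""
    policy: str
    feedback: list[str] = field(default_factory=list)

    def update(self, note: str) -> None:
        """Updating the algorithm = appending a note. No new model required."""
        self.feedback.append(note)

    def to_prompt(self) -> str:
        """Flatten the policy plus feedback into the system prompt sent
        alongside each batch of posts to be ranked."""
        lines = [self.policy] + [f"User feedback: {n}" for n in self.feedback]
        return "\n".join(lines)

me = FeedPolicy("Show me posts about programming and climbing; hide engagement bait.")
me.update("Hey, I'm feeling bored, can you switch it up a bit?")
```

A platform could store one of these per user, with "profiles" being nothing more than multiple saved `FeedPolicy` objects you switch between.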
We could also save these different feeds as profiles that we can switch between depending on our mood. This is actually an idea that Facebook had back when it first launched the News Feed!
I think this way of sorting and filtering content would be pretty incredible, if it actually works.
But would it?
At least in Courtland’s spam-detection example, so far the answer is a clear “yes, it works”:
I hope I'm not jinxing myself here but... the arms race might be over?
GPT-4 catches everything that looks like spam to humans.
The only refuge for spammers is to post stuff that DOESN'T look like spam to humans.
Which is exactly what I want 👌
Of course, sorting content into two buckets (“spam” and “not spam”) is more straightforward than ranking and filtering all the non-spam content into a personalized feed for each user. But it made me curious to see just how well GPT-4 could rank tweets today. So at the bottom of the ChatGPT thread where it had interviewed me about my interests, I asked it to rank the first four tweets from my For You page. I just copied and pasted the text of each tweet, along with the like, retweet, and reply counts, and whether or not I followed the person who posted it. The results were pretty good! And they got much better when I went through a few cycles of explaining what I didn’t like about certain tweets. So overall, I think the performance is already there.
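If you want to try something similar, here's a sketch of how you might format tweets and their engagement stats into a ranking prompt, along the lines of what I did by hand (field names and wording are illustrative):

```python
# Sketch: formatting tweets + metadata into a ranking prompt.
# The structure mirrors what I pasted into ChatGPT manually; the exact
# field names and instruction wording are illustrative.

def format_tweet(text, likes, retweets, replies, following):
    """Render one tweet plus its engagement stats as a plain-text block."""
    return (
        f"Tweet: {text}\n"
        f"Likes: {likes} | Retweets: {retweets} | Replies: {replies} | "
        f"Following author: {'yes' if following else 'no'}"
    )

def build_ranking_prompt(tweets):
    """Ask the model to rank a batch of tweets, given the earlier interview."""
    blocks = [format_tweet(**t) for t in tweets]
    return (
        "Based on what you've learned about me in this interview, rank the "
        "following tweets from most to least worth my attention, and briefly "
        "explain each ranking:\n\n" + "\n\n".join(blocks)
    )
```

This goes at the bottom of the same conversation as the interview prompt, so the model can use what it learned about you.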
The real issue is cost. As Courtland notes, for now this would be an expensive solution to implement at scale. I’d estimate it costs a minimum of around three-quarters of a cent to analyze each post (*see the footnote for how I got to this number), which adds up to ~$2,800 per year if you analyze a thousand posts per day. Twitter supposedly processes 500 million tweets per day. At three-quarters of a cent per tweet, that comes out to roughly ~$1.4 billion per year (yes, billion, with a “b”).
Obviously Twitter is not going to spend $1.4 billion per year to run each and every tweet through GPT-4. They can barely pay rent. And most users would not personally fork over thousands per year (or hundreds per month) for better tweet filtration. But this doesn’t mean the idea is doomed. Estimating the future of AI using today’s costs is not a smart move. AI will only get better and cheaper over time, perhaps massively so, like computers have. (LLMs are the new CPUs, after all!) As increased competition, open-source alternatives, economies of scale, and more efficient techniques all come online in the coming years, prices should drop like a rock. We’ve already seen it with ChatGPT: the API costs 10x less than GPT-3. I would be extremely surprised if this trend does not continue.
So, assuming the cost issue gets solved, what would the world look like if we could all afford to have an LLM analyze every piece of information we see so that it could suppress the noise and boost the signal?
The first-order effects of having an AI-powered noise filter for the internet would be mostly positive.
People would see more of the content that is aligned with their stated goals and interests, and less of the distracting time-wasters. They would have more control over what they see. They could change their preferences on the fly, depending on their mood. These are all good things.
Some might wonder if better algorithms would make people even more addicted to social media, but I’m not so sure. The most important question is not “is this algorithm better or worse,” it’s “who controls this algorithm and what goal are they using it to optimize?” In today’s world, social media companies control the algorithms, and they use them to keep us on their platforms as long as possible and clicking as many ads as possible.
Naturally, social media companies will not want to give up this control. But sometimes there are technological forces that are more powerful than even the biggest companies. To me, it seems clear that an LLM-based filtering system will be a naturally democratizing force. It will nudge the world towards putting more control in the hands of users, because the policy is no longer encoded in an impossible-to-understand collection of numbers (the model’s weights), but instead in a natural-language document that is interpreted by an LLM. When more people realize what is possible, I think they will want to control (and even pay for) their own algorithms.
It will start small. Power users will try to hack it with browser extensions and other workarounds. It’s easy to imagine a Chrome extension that hides tweets matching certain descriptions, like a more powerful, flexible form of keyword muting. Perhaps new platforms like Bluesky will show people what kinds of experiences are possible with algorithmic choice, even if they don’t reach the same scale as the largest incumbents. Apple could play a role here, too: the same way it pushed for privacy, it might push for this. Twitter and YouTube might see it as a way to boost subscription revenue. Facebook might see it as a way to resurrect its shrinking market share among US-based young adults.
Of course, many of these scenarios are longshots. But the internet itself was a longshot. In the AOL era, it was dominated by old media companies; look how that turned out in the long run. My general belief is that companies wield power by harnessing trends, not resisting them. We shouldn’t be too quick to write off the possibility of large changes.
If users control their own LLM-based algorithms, I don’t think we’ll see the same patterns of addictive behavior as we have before. Some might argue that content recommendation algorithms are just revealing what we really want, but I think they’re more likely stuck at a local maximum, based on a limited understanding of what we want and of how the content we see affects us. LLMs have the capability of breaking through to a better one.
But it’s not all good news. As excited as I am about the possibilities for LLM-based content filtering, I’m also scared. It’s hard to see how this could make the filter bubble problem anything but worse. What would happen if society fragmented even further? It already seems like we are at a breaking point and can barely function.
Unfortunately, I’m not sure what can be done. It seems inevitable that this technology is coming and will be good in a lot of ways. I don’t think an outright ban would solve much. Plus, if you’re really worried about AI, it’s entirely possible that filter bubbles are the least of our problems.
For now, the best we can do is pay attention and build towards the future we want.
* In case you’re curious, here’s how I got to that cost estimate. First, I assume that the posts his bot moderates are roughly the size of a tweet, on average. Tweets are typically 40–100 tokens long. You’d need a few examples and instructions in the prompt to make it effective, so I’m guessing you’d end up with a prompt of ~250 tokens total, on average. GPT-4 costs 3¢ per thousand prompt tokens, and 6¢ per thousand tokens of text it generates. Luckily, all you need from the response is “Spam” or “Not spam” (1 and 2 tokens, respectively, but we should round up to 2 since the vast majority of posts aren’t spam). Using these assumptions, the cost to analyze one post is (250 × $0.03/1000) + (2 × $0.06/1000) = $0.00762, or roughly three-quarters of a cent.
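If you’d like to check the arithmetic yourself, here it is as a few lines of Python, using the same assumptions as above:

```python
# Reproducing the footnote's cost estimate with the same assumptions.

PROMPT_TOKENS = 250              # instructions + few-shot examples + the post itself
COMPLETION_TOKENS = 2            # "Not spam" (rounding up from "Spam")
PROMPT_PRICE = 0.03 / 1000       # GPT-4: $0.03 per 1K prompt tokens
COMPLETION_PRICE = 0.06 / 1000   # GPT-4: $0.06 per 1K completion tokens

cost_per_post = PROMPT_TOKENS * PROMPT_PRICE + COMPLETION_TOKENS * COMPLETION_PRICE
# = $0.00762, i.e. about three-quarters of a cent

yearly_personal = cost_per_post * 1_000 * 365        # 1,000 posts/day -> ~$2,800/yr
yearly_twitter = cost_per_post * 500_000_000 * 365   # 500M tweets/day -> ~$1.4B/yr
```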