I Built an AI Chatbot Based On My Favorite Podcast
Here’s how I built it and what I learned about the future
Sponsored By: Reflect
This article is brought to you by Reflect, a beautifully designed note-taking app that helps you to keep track of everything, from meeting notes to Kindle highlights.
Update: if you want to use the chat bot mentioned in this article, it's available for Every subscribers here:
In the future, any time you look up information you’re going to use a chatbot. This applies to every piece of information you interact with day to day: personal, organizational, and cultural.
On the personal side, if you're trying to remember an idea from a book you read, or something a colleague said in a meeting, or a restaurant a friend recommended to you, you’re not going to dig through your second brain. Instead, you’re going to ask a chatbot that sits on top of all of your notes, and the chatbot will return the right answer to you.
On the organizational side, if you have a question about a new initiative at your company, you’re not going to consult the internal wiki or bother a colleague. You’re going to ask the internal chatbot, and it will return an up-to-date, trustworthy answer to you in seconds.
On the cultural side, if you want to know what your favorite podcaster says about a specific topic, you’re not going to have to Google them, sort through an episode list, and listen to a two-hour audio file to find the answer. Instead, you’ll just ask a chatbot trained on their content library, and get an answer instantly.
This future may seem far out, but it’s actually achievable right now. I know, because I just built a demo over the weekend. And it already works.
. . .
Reflect is a fast note-taking app designed to model the way you think. Use it as a personal CRM, as a way of taking meeting-notes, or just generally to keep track of everything in your life.
Reflect has integrations into all your favorite tools (calendar, browser, Kindle), so you can always find what you’ve read and researched. We work online or offline, desktop or mobile.
Think faster and clearer with Reflect.
I love listening to the Huberman Lab podcast, a neuroscience podcast by Stanford neurobiologist Andrew Huberman. It’s got a good mix of interesting science and actionable advice to improve how your brain and body operate.
Unfortunately, the episodes are also very long—each one averages a few hours. I often have a specific question related to something that he's already covered, but it's a pain to go back and try to find the answer by scrubbing through old episodes.
So I made a solution over the weekend: a Huberman Lab chatbot using GPT-3.
I made a chatbot based on
Ask it any question and it will scan through Huberman Lab podcast transcripts and return the answer using GPT-3.
(You can even ask it if Cell Press journal is high quality or not....😉)
It was simple to build, and it can already answer questions plausibly well. I can ask questions about topics that the podcast has covered in the past, and it answers them using transcripts of old episodes as an information source.
Here’s a quick example. Dr. Huberman recently did an episode on the science of creating and maintaining habits. One of his suggestions is to do “task bracketing” to help create habits—a phrase I’ve had in my head to try since I listened to the episode, but I can’t remember exactly what it means.
Usually, I’d have to go and re-listen to the episode to answer the question. But instead I asked Huberman bot:
Me: Can you describe how to do task bracketing?
Huberman bot: Task bracketing involves dividing the 24-hour day into three phases: phase one (0 to 8 hours after waking up), phase two (8 to 16 hours after waking up), and phase three (16 to 24 hours after waking up). Place particular habits at particular phases of the day in order to engage the task-bracketing mechanisms. This will make it more likely that you will be able to regularly engage in these habits and activities over a long period of time. Additionally, positively anticipate the onset and the offset of the session and reward yourself for the entire experience. This will help shift your nervous system toward being able to predict that certain things are going to happen at particular times of day.
Me: Why do I divide the day into three phases when I'm doing task bracketing?
Huberman bot: Dividing the day into three phases when doing task bracketing is useful because it allows you to leverage neural systems in order to help you make it more likely that you're going to be able to engage and maintain a particular habit. Phase one (0-8 hours after waking) is useful for acquiring certain habits, phase two (9-14/15 hours after waking) is useful for acquiring other types of habits, and phase three (16-24 hours after waking) is useful for calming activities.
Me: What kinds of habits are better acquired during the first phase of the day?
Huberman bot: Habits that require a lot of activation energy and override of limbic friction are better for acquired during the first phase of the day.
This is pretty good for a weekend’s worth of work. To be sure, it still leaves a few things to be desired. For one, it gets things subtly wrong. For another, sometimes it’s not specific enough to answer the question, and I have to ask follow-ups to get the right answer.
But it’s close enough, and the rate of improvement of this technology is steep enough, that I think we’ll see stuff like this flourish over the next year or so. It’ll be a big change for the way content is produced and consumed—and it presents massive benefits for people who consume content, and massive opportunities for people who create it.
How Huberman Bot works (and when it fails)
The principles behind the Huberman bot are simple:
- It ingests and makes searchable all of the transcripts from the Huberman Lab podcasts.
- When a user asks a question, it searches through all of the transcripts it has available and finds sections that are relevant to the query.
- Then, it takes those sections of text and sends them to GPT-3 with a prompt that looks something like:
Answer the question as truthfully as possible using the provided context, and if the answer is not contained within the text below, say "I don't know."
[ relevant sections of Huberman Lab transcripts ]
Q: What is task bracketing?
GPT-3 is smart enough, given a question and relevant source material, to answer that question based on the latter. I built it mostly using this code example from OpenAI with a bunch of custom modifications for my use case. It took probably a weekend of effort.
The problems, though, are obvious.
For one, the length of the prompt you can send to the model is capped at 4,000 tokens—where a token is roughly equivalent to ¾ of a word. So you’re limited in terms of how much context you can feed into GPT-3 when asking it a question. You have to hope that your search algorithm (in this case a vector similarity using OpenAI’s embeddings search) found the most relevant pieces of transcript such that the answer to the question exists in what you’re providing to the model.
This often works, but it fails just as frequently. The bot is subtly wrong a lot or is not specific enough to fully answer the question. For example:
Me: What is reward prediction error?
Huberman bot: Reward prediction error is the way in which people navigate rewards and learning, including habit formation. It is the release of dopamine in the brain and body when something happens that is positive and novel. If you expect a reward and the reward comes, a particular behavior that was associated with generating that reward is more likely to occur again. If you expect a reward and the reward does not come, the pattern of dopamine release will follow a particular trajectory.
This is plausible but wrong in a few ways. For example, it doesn’t directly answer the question because it misses something key about reward prediction error: it’s about the difference between the reward you predicted and the one you received. The greater the difference, the more rewarding (or deflating) the experience is.
It’s easy to dismiss this technology given these shortcomings. But most of them are immediately solvable.
The answers will get a lot better if I clean up the data used to generate them. Right now, they’re based on raw transcripts of podcast episodes. When humans talk, they don’t tend to talk in crisp sentences so the answers to a lot of the questions I might ask are spread out around the episode and aren’t clearly spelled out in the transcript. If I cleaned up the transcripts to make sure that, for example, every term was clearly defined in a single paragraph of text, it would make for much better answers.
Another method to try is to chain GPT-3 prompts together in order to check and refine the answer. For example, once I get back a response from the Huberman bot, before returning it to the user I could send it back to GPT-3 saying something like, “Here is a response returned from a chatbot. Write out its argument and find any factual or logical errors. Then correct them.” This could work as a filter for bad answers—and once GPT-3 can access the internet, this type of answer-checking service would become phenomenally good.
Beyond improving the content of the answers, there are tricks you can pull to make the answers useful even if they’re wrong. If every time it answered a question it told me its source—e.g., where in the episode I could go to find more information—it wouldn't matter as much if the answer was vague or slightly wrong because I could check its work. This is eminently achievable with the current technology.
And whatever isn’t solvable in the short term will be solved soon. Case in point: in between building this bot and writing this article, OpenAI released a new version of its embeddings search that will significantly improve the results and lower the cost to get them by 99%. The pace at which all of this is moving is mind-blowing.
I’d like to release the bot publicly, but I want until I improve the quality of responses. Until then, it will be available for Every paid subscribers. If you want to try it out, become a paid subscriber. You’ll get access within the next week.
Chatbots as a new content format
Being able to easily turn any corpus of text into a reasonably good chat bot is a big deal. for readers and it’s also a huge deal for content creators. It also has significant—and positive—business implications.
To start, it means that there’s a new way to monetize any existing set of intellectual property. You might not pay to access a back catalog of all Huberman Lab episodes—but if that back catalog was reformatted to be a chatbot that could instantly answer your science questions, there’s a good bet you’d input your credit card. The same is true for all sorts of writers, podcasters, and YouTubers across the content creation spectrum.
In the future, anything that’s used as a reference should become a chat bot. Wirecutter, Eater, and more should all be accessible this way so that when I have a product I want to buy, or I’m in need of a restaurant to visit I don’t have to scroll through a list of articles with lots of options. Instead, I can just ask, “What’s a good place to eat in Fort Greene tonight?” and get a response that’s based on the latest Eater reviews in my neighborhood.
There are hundreds of thousands of copyrighted text, audio, and video corpuses that can be valuable chatbots today—all that’s needed is someone willing to make them. Even if you can’t code, there’s a valuable opportunity to buy the rights to turn this information into chatbots and sell them later to a bidder who can.
This doesn’t just extend to bodies of text that are intended to be read by the general public. For a long time I’ve been enamored with the idea that every company should have a librarian—someone who is tasked with writing down tacit knowledge in the organization, making sure all documents are up to date, and answering repetitive questions. It would save a lot of time for employees, reduce unnecessary confusion, and enable different parts of the organization to collaborate more effectively.
Here’s a slide from a deck I built in 2020 with a friend as we were considering whether or not to work on this problem as startup:
It turns out, that’s probably possible right now. You could have a chatbot that answers questions by sourcing information from the right person or document, makes sure documents are up to date, and proactively records tacit knowledge into living documents by periodically interviewing key people about their progress. I’m sure there are smart teams working on this already, and I’m excited to see which ones launch and what kinds of interactions end up working.
Where power settles in this ecosystem
One of the questions that arises for me when I think about all of this is: who’s going to win?
Is the future thousands of different chatbots all sitting on their own websites, all trained on their own text corpuses? Or are we all going to just use ChatGPT? Or is it something else? Where does power sit in an ecosystem like this?
I think power will settle in at least four places:
- The operating system layer
- The browser layer
- The layer of models that are willing to return risky results to users
- The copyright layer
Operating systems and browsers will have a lot of power because they’ll be able to sit between the user and any other interaction with the internet. Once Chrome, Arc, or Mac OS has an intelligent AI assistant that can perform tasks for you given a few sentences about what you want, your desire to go to a website that can do the same thing will go down tremendously. Operating systems and browsers also have the advantage of being run on your computer, so the integration will be more seamless and they’ll have access to lots of data that web-based AI applications aren’t going to have access to—so they’ll likely have better abilities to complete tasks.
But the disadvantages of being at the operating system or browser layer is that you’re going to have to serve a large volume of users. As I wrote in "Artificial Unintelligence," this will force these kinds of models to have self-imposed limits about what kinds of tasks they’re willing to perform and what kinds of results they’re willing to return to the user. This creates room for other players that are willing to return answers from corners of the latent space that are riskier (legally, morally, or in terms of brand alignment) than more general models.
Finally, I think there’s going to be a lot of power and opportunity for copyright holders in this new era. For now, people are training AI willy-nilly on lots of copyrighted material. But over time, I think large copyright holders (like Disney) will fight back in the same way the record industry did against Napster. I don’t know where it will net out, but I’d bet on copyright holders benefiting financially when their IP is referenced by these models in a way that still allows developers to build them, and users to benefit from them.
It’ll be exciting to see how this plays out. In the meantime, it’s incredibly fun to play with projects like the Huberman Bot. I have a bunch of these experiments in the works, and I’ll be writing more about them in the coming weeks and months.
Paid subscribers will get access to these before everyone else (and to stuff I won’t end up releasing publicly). So if you're not one already, make sure to become a paid subscriber:
Thanks to our Sponsor: Reflect
Think faster and clearer with Reflect, a beautifully designed note-taking app. Reflect is built so that you never lose track of all notes, meetings, kindle highlights, calendar, web snippets and more.