Everything OpenAI Launched at DevDay

A real-time API, prompt caching, higher rate limits, model distillation tools, and more

September 30, 2024 · Updated January 28, 2026

I’m at OpenAI’s developer conference, DevDay, today in San Francisco. Here’s what I saw.

The big news is that the company launched a Realtime API that promises to let anyone build functionality similar to ChatGPT’s Advanced Voice Mode within their own app. By pairing it with o1, the new model it released a few weeks ago, OpenAI is creating a new way to build software.

o1 can prototype anything you have in your head in minutes rather than months. And the Realtime API enables developers to build software with a novel interface—life-like voice conversations—that was previously only the domain of science fiction.

OpenAI also announced the following:

An increase in the API rate limits on the o1 model to match those of GPT-4o (10,000 requests per minute)
A reduction in the price of GPT-4o API calls through automatic prompt caching, which makes repeated API calls 50 percent cheaper with no extra developer effort
A multimodal fine-tuning API that allows developers to fine-tune GPT-4o with images in addition to text
A tripling of the number of active apps on the OpenAI platform over the past year, along with a total of 3 million active developers
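
To get a feel for what the prompt caching discount could mean for a bill, here is a minimal Python sketch of the arithmetic. It rests on assumptions that were not spelled out on stage: that the 50 percent discount applies only to the repeated part of each request's input, and that the first call pays full price because nothing has been cached yet. The names Call and input_cost and the per-million-token price are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass

# Assumption: caching halves the price of the repeated part of the input only.
CACHE_DISCOUNT = 0.5


@dataclass
class Call:
    repeated_tokens: int  # input shared with earlier calls, e.g. a long system prompt
    fresh_tokens: int     # input unique to this call


def input_cost(calls, price_per_million, caching=True):
    """Estimate total input cost, in the same currency as price_per_million."""
    billable_tokens = 0.0
    for i, call in enumerate(calls):
        # Assumption: the first call has nothing to reuse, so it gets no discount.
        discount = CACHE_DISCOUNT if (caching and i > 0) else 0.0
        billable_tokens += call.repeated_tokens * (1 - discount) + call.fresh_tokens
    return billable_tokens * price_per_million / 1_000_000


# Hypothetical workload: 100 calls that share a 2,000-token prefix
# and each add 500 new tokens, at a made-up price of 2.00 per million tokens.
calls = [Call(repeated_tokens=2000, fresh_tokens=500) for _ in range(100)]
without_cache = input_cost(calls, price_per_million=2.0, caching=False)
with_cache = input_cost(calls, price_per_million=2.0)
```

Under these assumptions the workload costs 0.50 without caching and about 0.302 with it. That is a saving of roughly 40 percent rather than a full 50, because the fresh tokens and the first call are still billed at full price.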

Let’s get into the details.

o1 victory lap

OpenAI released its new reasoning model o1 two and a half weeks ago, and the company’s excitement about it was palpable in the room. Olivier Godement, OpenAI’s head of product for the API, described it as a new family of models, distinct from GPT-4o, which is why they reset the number on the model back to one.

At the beginning of the AI wave there was a lot of talk about what artificial general intelligence would look like: Would it be one model to rule them all—GPT-5 for all comers—or would there be different models for different purposes? Today OpenAI told developers that they would be investing heavily in both next-generation GPT-4o and o1-type models. The company is betting on a diversity of models for different use cases as the way forward.

o1 models excel at reasoning, which OpenAI defines as being able to think in a chain-of-thought format. That makes them better at tasks like programming, but also slower and more expensive. These models will be used for tasks that require more advanced reasoning, but they won’t become the default model choice because most prompts won’t require that level of reasoning.

The increase in programming power was on display today. Romain Huet, OpenAI’s head of developer relations, did a live demo where he used o1 to build an iPhone app end-to-end with a single prompt in less than 30 seconds. He also demonstrated building a web app to control a drone that he brought on stage, and used it to pilot the drone for the audience.