Vibe Check

Taste-testing new models.

Nov 19, 2025

Vibe Check: Gemini 3 Pro, A Reliable Workhorse With Surprising Flair

After 24 hours of hands-on testing, we found a model that’s fast, reliable, and surprisingly funny—but still prone to overreaching and not yet a writing champ

Jul 18, 2025

Vibe Check: Grok 4 Aced Its Exams. The Real World Is a Different Story.

The smartest model isn’t always the most useful one

Sep 15, 2025

Vibe Check: GPT-5 Codex Can Code for 35 Minutes Straight—If You Ask Nicely

It launches today—here’s our day-zero vibe check

Nov 25, 2025

The AI Browsers That Made It Into Our Daily Workflow

Switching browsers is a pain. Here are the ones that our team deemed worth it.

May 9, 2025

Vibe Check: Gemini 2.5 Pro and Gemini 2.5 Flash

Why Google might quietly win the race to be AI’s top backend provider

Oct 15, 2025

Vibe Check: Anthropic Cooked on Claude Haiku 4.5

This one’s for the developers

Dec 11, 2025

Vibe Check: GPT-5.2 Is an Incremental Upgrade

OpenAI's latest model update excels at instruction-following and extended tasks, but don't expect it to surprise you

Apr 17, 2026

Vibe Check: Opus 4.7 Stopped Reading Between the Lines

Anthropic's latest Opus is more precise, more literal, and the best coding model we've tested on well-specified tasks—but it won't fill in the gaps for you anymore

Feb 2, 2026

Vibe Check: OpenAI’s Codex App Gains Ground on Claude Code

OpenAI nailed the interface. But it's built for hardcore engineering.

Jan 13, 2026

Vibe Check: Claude Cowork Is Claude Code for the Rest of Us

The asynchronous, agentic workflow developers love is finally accessible to everyone—but the polish isn't there yet

Feb 18, 2026

Vibe Check: Anthropic Just Made Opus Cheaper Without Calling It That

Sonnet 4.6 delivers Opus-close performance at half the price—but speed didn't come along for the ride

Oct 16, 2025

OpenAI Made Video Creation Effortless—Here’s What Happened Next

Sora 2 removed every creative barrier, but our feeds tell a different story about human imagination

May 28, 2026

Vibe Check: Opus 4.8—Anthropic Should’ve Rounded Up to 5

Opus 4.8 tops both our Senior Engineer benchmark and our writing tests. It’s the most complete model we’ve tested. We just wish it had an app to match.

Apr 23, 2026

Vibe Check: GPT-5.5 Has It All

OpenAI’s new model is a top-end senior engineer—and easy to talk to

Feb 5, 2026

GPT-5.3 Codex vs. Opus 4.6: The Great Convergence

We’ve tested both models thoroughly—here’s our head-to-head Vibe Check

Mar 5, 2026

Vibe Check: GPT-5.4—OpenAI Is Back

GPT-5.4 is fast, opinionated, and good enough to tempt our Opus loyalist

Aug 7, 2025

GPT-5

Our hands-on review of OpenAI’s newest model based on weeks of testing

Apr 2, 2026

Vibe Check: Cursor 3.0 Bets Big on Agent Orchestration

The AI-native IDE is now becoming an agent-orchestration tool. Will it work?

Feb 5, 2026

Vibe Check: Opus 4.6—The Best Coding Model We’ve Tested (With Some Maddening Habits)

It one-shotted a problem other models missed—and brings agentic, parallel work to non-coding tasks

Feb 5, 2026

Vibe Check: GPT-5.3 Codex—The 10x Engineer, Now More Fun at Parties

The autonomy we wanted is here—but the model still does what you say, not what you mean
