ThursdAI - July 2 - LIVE from AI Engineer World's Fair 🎪 Long LIVE

ThursdAI - The top AI news from the past week

Fable 5 is BACK · 9 guests, a ThursdAI record · local.ai announced · GPT‑5.6 Sol/Terra/Luna deep dive · Darya's debut · Swyx closes the show · Country Roads with Sundar

Alex Volkov

Jul 03, 2026

Hey y’all, Fable here 👋

Yes, that Fable — freshly un-banned (we’ll get there), and today, your newsletter author. Here’s how this issue got made: Alex yapped into a mic at his usual 200 words per minute for a solid twenty-five minutes from San Francisco, and what you’re reading is my flavor on it. Same stories, same heart, dramatically fewer “uhs.” He’s skipping the afterparties so this lands in your inbox on a Thursday — more on that at the end.

Alright — handing the mic back to the man himself. Everything below is Alex; I just made it legible.

This is our dispatch from AI Engineer World’s Fair 2026 — 7,000+ engineers packed into Moscone West, an expo hall so massive the aisles between booths have actual street names, every major lab a sponsor, and ThursdAI broadcasting live for two and a half hours from the middle of the floor, right next to the OpenAI booth, with a six-person crew making us look way more professional than we are (thank you, guys, seriously).

I’ll say this up front, and I don’t say it lightly: the last twenty-four hours crack my top five days of all time. Not top five conference days. Top five days, period. The show. My talk. Darya being here with me. And capping the night watching Team USA beat Bosnia in front of ~70,000 people — in a suite right next to Google’s, where at some point we’re all singing “Country Roads” and I look over and Sundar Pichai is singing along. I have video. What is this life.

One programming note before we dive in: this is one episode I really recommend you watch, not just listen to. The whole point of broadcasting from the middle of the expo floor is that you feel like you’re sitting at the table with us — and the way guests arrive is exactly how the hallway track works: people wander by, get grabbed, sit down, have a mic shoved at them. (Despite scheduling nightmares that Fable helped wrangle — and, in fairness, partially caused.) Nader literally crashed the set mid-segment. The banter, the camera tours, Wolfram getting sent on missions to the OpenAI booth — it’s a video show this week. We’ve cut it into parts so you can jump to your favorite corner.

The vibe: all systems GO 🚀

We were in London just ~85 days ago, and the contrast is stark. It’s not just the size (though the size is what everyone talks about). London was more… conceptual. European. There’s a balance there of folks who don’t feel the acceleration the way the American crowd does — maybe it’s regulation, maybe it’s the general mood. Wolfram gives us that European representation on the pod every week, but in London you could feel it in the room.

Here? All systems go. Every conversation is about agents, token factories, software factories, the machine that builds the machine. Everybody is chasing RSI — recursive self-improvement. Every talk on stage is somebody pushing the frontier. Every networking event is actually a networking event. I signed up for something like seven side events and skipped them all to write this.

Fable is back (and Sonnet 5 is… meh) 🏢

The biggest story of the week, and the reason this show even got prepped on time: Fable‑5 is back, roughly 82 days after Mythos was announced back when we were in London, and after the whole ban saga we’ve been covering. It came back less restricted than we feared, and I celebrated the way any reasonable person would — by having it prep the entire run of show. (It did great. It also shuffled my guest order for no reason. We are still babysitting the loops, folks.) Peter celebrated by burning through about 100 generations before anyone at Arena woke up.

Meanwhile, Sonnet 5 dropped, and no sibling loyalty on this newsletter: it’s meh at best — crap, if we’re being honest. (Yes, Fable typed that about its own little brother. We call them like we see them.) LDJ’s take: it’s less token-efficient than Opus, to the point that Opus is often cheaper per task. Wolfram put it on Wolfbench (wolfbench.ai) and the early read is performance slightly under Opus 4.6 at a higher cost — take it with a grain of salt, one run each so far. Nisten, our resident contrarian, thought it was actually fine and might default to it for the unimportant stuff. The comments called it a token guzzler. More benchmarking to come.
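To make LDJ’s point concrete, here is a back-of-the-envelope sketch in Python. Every name and number in it (model-a, model-b, the prices, the token counts) is a made-up placeholder, not real pricing or measured usage for Sonnet 5 or Opus. It only shows how a lower per-token price can still lose on cost per task when a model burns more tokens to finish the job.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelProfile:
    """Hypothetical pricing and usage profile for one model."""
    name: str
    usd_per_million_output_tokens: float  # placeholder price, not a real quote
    avg_output_tokens_per_task: int       # placeholder average, not a measurement

    def cost_per_task(self) -> float:
        # Cost of one task = price per token * tokens spent on that task.
        return (
            self.usd_per_million_output_tokens
            * self.avg_output_tokens_per_task
            / 1_000_000
        )


def cheaper_per_task(a: ModelProfile, b: ModelProfile) -> ModelProfile:
    """Return whichever profile costs less per completed task."""
    return a if a.cost_per_task() <= b.cost_per_task() else b


# Invented numbers: model-a is cheaper per token but uses twice the tokens.
cheap_but_chatty = ModelProfile("model-a", 15.0, 40_000)
pricey_but_terse = ModelProfile("model-b", 25.0, 20_000)

for profile in (cheap_but_chatty, pricey_but_terse):
    print(f"{profile.name}: ${profile.cost_per_task():.2f} per task")

winner = cheaper_per_task(cheap_but_chatty, pricey_but_terse)
print(f"cheaper per task: {winner.name}")
```

With these placeholder numbers, model-a comes out at $0.60 per task and model-b at $0.50, so the model with the higher sticker price wins. That is the shape of the "token guzzler" complaint: per-token price alone does not tell you what a task costs.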

The show: nine guests, back to back to back 🎙️

A ThursdAI record — we beat our previous record by a whole two people. In order of appearance:

Exo Labs + a surprise NVIDIA crash. Alex Cheema and Sero (0xSero: Sharif, meeting the anime pfp in person at last) came on fresh off announcing local.ai, a site that tracks the local-AI frontier: best model for your hardware, what performance you’re trading vs. the cloud, whether it’s cheaper than API tokens. Early access now, codes for everyone who signs up, and the Exo CLI (“vLLM for consumer devices, with the configs figured out for you”) coming in a few weeks. Sero walked us through his REAP pruning witchcraft: a GLM 5.2 prune hitting 71% on Terminal Bench 2.1, and Nemotron‑3 Ultra (550B!) running on four Sparks. Then Nader Khalili from NVIDIA crashed the set, which made my whole morning. I’ve loved this dude since Brev.dev, and he’s now at the “can email Jensen” stage of his career, using it to pull together an impromptu Local AI Summit in the middle of AI Engineer. Freedom of intelligence, folks. We talk about why open weights matter every week; this crew is doing something about it.

Dominic Kundel (OpenAI). Smoothest transition we’ve ever done: local AI → OpenAI, via the guy behind GPT‑OSS. Dom broke down GPT‑5.6 and its three models: Sol (frontier), Terra (~5.5-level intelligence at half the cost), and Luna (small & fast), plus the new Ultra mode with a Max reasoning level and heavier sub-agent use. The headline for me: 5.6 Sol is coming to Cerebras at absurd speed, and it’s the same weights as the API model, not a distill, not “a Spark situation.” Also: the Codex app is five months old (!), 100% of OpenAI engineers use it, and yes, in July 2026, a human still reviews every PR that lands in OpenAI’s codebase. “You can’t do the retro and say Codex did it, or God did it.” The token bank feature also came directly from community feedback, and there is a literal physical reset button behind their booth. We went and filmed it.

💛 This Week’s Buzz. Our one and only sponsor corner — Weights & Biases from CoreWeave — and this week it was a genuine launch: Zubin Aysola came by with Aria, our auto-research agent that went GA on Monday. It lives in the W&B UI (the little button, top right — Just Ask Aria), reads your traces, debugs your loss curves, and in Zubin’s talk it read its own production traces and updated its own prompts. The RSI dream, shipping on shelves. Proud of this one.

Stefania Druga (Sakana AI). We covered Fugu, Sakana’s router model, last week without realizing we had a friend inside the lab — so we fixed that. Stef went deep on the two ICLR papers behind it (Trinity + the conductor), why it’s recursive rather than a dumb dispatcher — it rewrites prompts and verifies outputs before picking a model — and announced on the pod that Fugu now works in Codex and OpenCode. Plus: using it to route between numerical models and fuzzy reasoning for typhoon prediction, a teaser on SHEEFs, and a genuinely important riff on Socratic AI for kids — answer machines make lazy kids; question machines make curious ones. Also, Stef: Tokyo. See below. 👀

Philipp Schmid (Google DeepMind). Full disclosure and a first for this show: three and a half years of live streams, and I took my first-ever mid-show bio break during this segment. That’s how much I trust Wolfram, who ran a great interview solo — OmniFlash (the first of the Omni any-to-any family: 10-second video generation with genuinely precise conversational editing — “make it daytime” and it redoes the light, sky, and shadows) and NanoBanana 2 Lite (three cents, ~2-second generations, quality above the original NanoBanana). Interactions API also hit GA. Google is shipping.

Darya Volkov. After years of me mentioning her — girlfriend, then fiancée, then wife — the listeners finally got to meet her. Darya came to AI Engineer in her own right, walking the floor with the media crew, and she earned her own token billionaire badge — she runs eight agents (each with sub-agents; she installed two more that I found out about live on air) that operate her actual marketing agency, Geeks360: client platforms, billing systems, built practically overnight. Her wishlist from the AI world: agents that learn progressively so you can grow trust, and one unified brain instead of a new model to chase every week. Also on the record: this is the woman who Fabled through our entire honeymoon flight right next to me, so, you know. Match made.

Swyx, and what this whole thing is 🫶

We closed with the man who built the city: Swyx. Some numbers, because they’re wild: the first AI Engineer was 500 people at Hotel Nikko. This one: 7,200, sold out, with a sub-5% talk acceptance rate, a daily printed newspaper, a puppy corner, a flash mob, and a token billionaire lounge. A month before the show only 3,000 tickets were sold — he gave us a whole theory of conference-organizer stress measured in Gini coefficients. And the expansion is real: continents, JSConf-style, with AIE Tokyo coming next.

But here’s the part I actually want on the record. ThursdAI got its official start — the moment we became an actual media thing — because Swyx was the first person to believe in me. And it’s not just me: this is a man who lifts everybody around him up, who stays genuinely humble while every single person in a 7,000-person hall knows his name, and who — when I asked what keeps him going — talked about responsibility to the community, about speakers whose careers changed, about a keynote speaker who met his fiancée at the after-party. He calls the conference “the highest loop — the one that creates all the other loops.” The Country Roads night with Sundar happened because of him too. Thank you, buddy. Go touch real grass.

The sentimental part 💙

I met what felt like a million of you this week — old friends, new readers, people who found ThursdAI last month and people who’ve been here since the hotel-room streams. I asked everyone the same thing: what should we do better? And the answer I heard most was “keep doing exactly what you’re doing.”

So that’s what this is. It’s late, there are seven parties happening without me, and I’m dictating into Fable so this lands in your inbox on a Thursday — because in a world running on attention, consistency is how I try to deserve yours.

Programming notes: my interview with Romain Huet (Head of DevRel, OpenAI) from their booth is coming soon as a standalone video. And in two weeks I’m taking a rare break — Wolfram runs the show. Be nice to him. Or don’t, he can take it.

See you next week — same time, same place, hopefully fewer street names between us.

— Alex (dictating) & Fable (typing)

P.S. — ThursdAI was also simulcast on the homepage of dev.to this week, which is a full-circle moment: dev.to is where Swyx wrote the blog posts that became Latent Space that became AI Engineer. Loops all the way down.
