Hey! Alex here, with another weekly AI update!
It seems like ThursdAI is taking a new direction, as this is our 3rd show this year and our 3rd deep dive into a topic (previously Ralph and Agent Skills). Please let me know in the comments if you like this format.
This week’s deep dive is into Clawdbot, a personal AI assistant you install on your computer but control from your phone. It has access to your files, can write code, and helps organize your life, but most importantly, it can self-improve. Seeing Wolfred (my Clawdbot) learn to transcribe incoming voice messages blew my mind, and I wanted to share this one with you at length! We had Dan Peguine on the show for the deep dive, and both Wolfram and Yam are avid users! This one is not to be missed. If ThursdAI is usually too technical for you, use Claude, and install Clawdbot after you read/listen to the deep dive!
Also this week, we read Claude’s Constitution that Anthropic released, heard a bunch of new TTS models (some are open source and very impressive), and talked about the new lightspeed coding model GLM 4.7 Flash. First the news, then the deep dive, let’s go 👇
Open Source AI
Z.ai’s GLM‑4.7‑Flash is the Local Agent Sweet Spot (X, HF)
This was the open‑source release that mattered this week. Z.ai (formerly Zhipu) shipped GLM‑4.7‑Flash, a 30B MoE model with only 3B active parameters per token, which makes it much more efficient for local agent work. We’re talking a model you can run on consumer hardware that still hits 59% on SWE‑bench Verified, which is uncomfortably close to frontier coding performance. In real terms, it starts to feel like “Sonnet‑level agentic ability, but local.” I know, I know, we keep saying “Sonnet at home” about different open-source models, but this one slaps!
Nisten was getting around 120 tokens/sec on an M3 Ultra Mac Studio using MLX, and that’s kind of the headline. The model is fast and capable enough that local agent loops like Ralph suddenly feel practical. It also performs well on browser‑style agent tasks, which is exactly what you want for local automation without sending all your data to a cloud provider.
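If you want to kick the tires yourself, here’s a minimal sketch of running it on a Mac with the mlx-lm package. The HF repo id is my assumption about the naming, so check Z.ai’s HuggingFace page for the actual path and for quantized variants that fit smaller machines.

```python
# Minimal local-generation sketch with mlx-lm (pip install mlx-lm).
# NOTE: the repo id below is an assumption -- check Z.ai's HF page for
# the real path and for 4-bit quantized variants for smaller Macs.
from mlx_lm import load, generate

model, tokenizer = load("zai-org/GLM-4.7-Flash")  # hypothetical repo id

# Apply the chat template so the MoE model sees a proper conversation turn.
messages = [{"role": "user", "content": "Find the 10 largest files under ~/Downloads and suggest what to delete."}]
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)

# verbose=True streams tokens so you can eyeball the tokens/sec yourself.
response = generate(model, tokenizer, prompt=prompt, max_tokens=256, verbose=True)
print(response)
```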
Liquid AI’s LFM2.5‑1.2B Thinking is the “Tiny but Capable” Class (X, HF)
Liquid AI released a 1.2B reasoning model that runs in under 900MB of memory while still managing to be useful. This thing is built for edge devices and old phones, and the speed numbers back it up. We’re talking 239 tok/s decode on an AMD CPU, 82 tok/s on a mobile NPU, and prefill speeds that make long prompts actually usable. Nisten made a great point: on iOS, there’s a per‑process memory limit around 3.8GB, so a 1.2B model lets you spend your budget on context instead of weights.
This is the third class of models we’re now living with: not Claude‑scale, not “local workstation,” but “tiny agent in your pocket.” It’s not going to win big benchmarks, but it’s perfect for on‑device workflows, lightweight assistants, and local RAG.
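If you want to play with this class of model on a laptop before an actual phone, a quantized GGUF through llama-cpp-python is the easy route. A minimal sketch, assuming a hypothetical Q4 quant filename (use whatever quants Liquid actually publishes):

```python
# On-device-style sketch with llama-cpp-python (pip install llama-cpp-python).
# NOTE: the GGUF filename is a placeholder -- grab whichever quant Liquid
# publishes; a Q4 quant of a 1.2B model sits well under the ~900MB figure.
from llama_cpp import Llama

llm = Llama(
    model_path="./lfm2.5-1.2b-thinking-q4_k_m.gguf",  # placeholder path
    n_ctx=8192,    # tiny weights leave the memory budget for context
    n_threads=4,
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Plan my morning in three bullet points."}],
    max_tokens=200,
)
print(out["choices"][0]["message"]["content"])
```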
Voice & Audio: Text To Speech is hot this week with 3 releases!
We tested three major voice releases this week, and I’m not exaggerating when I say the latency wars are now fully on.
Qwen3‑TTS: Open Source, 97ms Latency, Voice Cloning (X, HF)
Just 30 minutes before the show, Qwen released their first model of the year, Qwen3 TTS, in two sizes (0.6B and 1.7B). With support for voice cloning from just 3 seconds of reference audio, and claims of 97ms latency, this Apache 2.0 release looked very good on the surface!
The demos we did on stage though... were lackluster. TTS models like Kokoro previously impressed us with super tiny sizes and decent voices, while Qwen3 didn’t really deliver on the cloning front. For some reason (I tested in Russian, which they claim to support), the cloned voice kept repeating the reference sample instead of generating the text I gave it. This confused me, and I’m hoping it’s just a demo issue, not a problem with the model. They also support voice design, where you just type in the kind of voice you want, which, to be fair, worked fairly well in our tests!
With an Apache 2.0 license and full finetuning support, this is a great release for sure, kudos to the Qwen team! Looking forward to seeing what folks do with this once it’s working properly.
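For the curious, here’s what I’d expect a 3-second cloning call to look like. To be clear, this is a hypothetical sketch of the API shape, not Qwen’s actual inference code; every name below is a guess, so grab the real snippet from the model card.

```python
# HYPOTHETICAL sketch of a 3-second voice-cloning call. The package,
# class, and method names are guesses at the API shape, NOT Qwen's
# actual inference code -- check the HF model card for the real snippet.
import soundfile as sf
from qwen3_tts import Qwen3TTS  # hypothetical package and class

tts = Qwen3TTS.from_pretrained("Qwen/Qwen3-TTS-1.7B")  # hypothetical repo id

audio, sample_rate = tts.clone_and_speak(    # hypothetical method
    reference_audio="my_voice_3s.wav",       # ~3 seconds of reference audio
    text="Testing the cloned voice on a brand new sentence.",
)
sf.write("cloned.wav", audio, sample_rate)
```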
FlashLabs Chroma 1.0: Real-Time Speech-to-Speech, Open Source (X, HF)
Another big open-source release in the audio category this week was Chroma 1.0 from FlashLabs, which claims to be the first true speech-to-speech model (not a model built on the traditional ASR > LLM > TTS pipeline), with a claimed 150ms end-to-end latency!
The issue with this one: the company released an open-source 4B model and claimed it powers their chat interface demo on the web, but the release notes say the model is English-only, while on the website it sounds incredible and I spoke to it in other languages 🤔 I think the model we tested is not the open-source one. I couldn’t confirm this at the time of writing; I’ll follow up with the team on X and let you guys know.
Inworld AI launches TTS-1.5: #1 ranked text-to-speech with sub-250ms latency at half a cent per minute (X, Announcement)
Ok, this one is definitely in the realm of “voice realistic enough you won’t be able to tell.” This is not an open-source model; it’s a new competitor to 11labs and MiniMax, the two leading TTS providers out there.
Inworld claims to achieve better results on the TTS Arena while being significantly cheaper and faster (up to 25x cheaper than leading providers like 11labs).
We tested out their voices and they sounded incredible, responded fast, and were generally a very good experience. With a 130ms response time for their mini version, this is a very decent new entry into the world of TTS providers.
Big Companies: Ads in ChatGPT + Claude Constitution
OpenAI is testing ads in ChatGPT’s free and Go tiers. Ads appear as labeled “Sponsored” content below responses, and OpenAI claims they won’t affect outputs. It’s still a major shift in the product’s business model, and it’s going to shape how people perceive trust in these systems. I don’t love ads, but I understand the economics: they have to make money somehow, and with 900M weekly active users, many of them on the free tier, they are bound to make some money with this move. I just hope they won’t turn into a greedy, ad-optimizing AI machine.
Meanwhile, Anthropic released an 80‑page “New Constitution for Claude” that they use during training. This isn’t a prompt, it’s a full set of values baked into the model’s behavior. There’s a fascinating section where they explicitly talk about Claude’s potential wellbeing and how they want to support it. It’s both thoughtful and a little existential. I recommend reading it, especially if you care about alignment and agent design.
I applaud Anthropic for releasing this under a Creative Commons license for public scrutiny and adoption 👏
This week’s buzz - come join the hackathon I’m hosting Jan 31 in SF
Quick plug: we have limited seats left for the hackathon I’m hosting for Weights & Biases at the SF office. If you’re reading this and want to join, I’ll approve you if you mention ThursdAI in the application!
With sponsors like Redis, Vercel, BrowserBase, Daily, and Google Cloud, we are going to give out a LOT of cash as prizes!
I’ve also invited a bunch of my friends from the top agentic AI places to be judges. It’s going to be awesome, so come!
Deep dive into Clawdbot: a Local-First, Self-Improving, and Way Too Capable Agent
Clawdbot (C‑L‑A‑W‑D) is that rare project where the hype is justified. It’s an open-source personal agent that runs locally on your Mac but can talk to you through WhatsApp, Telegram, iMessage, Discord, Slack, basically wherever you already talk. What makes it different is not just the integrations; it’s the self‑improvement loop. You can literally tell it “go build a new skill,” and it will… build the skill, install it, adopt it, and start using it. It’s kind of wild to see it work for the first time. Now... it’s definitely not perfect, far, far away from the polish of ChatGPT / Claude, but when it works, damn, it really is mindblowing.
That part actually happened live in the episode. Dan Peguine 🐧 showed how he had it create a skill to anonymize his own data so he could demo it on stream without leaking his personal life. Another example: I told my Clawdbot to handle voice notes in Telegram. It didn’t know how, so it went and found a transcription method, wrote itself a skill, saved it, and from that point on just… did the thing. That was the moment it clicked for me. (just before posting this, it forgot how to do it, I think I screwed something up)
Dan’s daily brief setup was wild too. It pulls from Apple Health, local calendars, weather, and his own projects, then produces a clean, human daily brief. It also lets him set reminders through WhatsApp and even makes its own decisions about how much to bother him based on context. He shared a moment where it literally told him, “I won’t bug you today because it’s your wife’s birthday.” That isn’t a hardcoded workflow — it’s reasoning layered on top of persistent memory.
And that persistent memory is a big deal. It’s stored locally as Markdown files and folders, Obsidian‑style, so you don’t lose your life every time you switch models. You can route the brain to Claude Opus 4.5 today and a local model tomorrow, and the memory stays with you. That is a huge step up from “ChatGPT remembers you unless you unsubscribe.”
There’s also a strong community forming around shared skills via ClawdHub. People are building everything from GA4 analytics skills to app testing automations to Tesla battery status checkers. The core pattern is simple but powerful: talk to it, ask it to build a skill, then it can run that skill forever.
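To make the pattern concrete: a skill is essentially a Markdown file with YAML frontmatter, the same SKILL.md convention as the Agent Skills we deep-dived previously. Here’s an illustrative sketch of what a voice-note skill could look like; this is my mockup, not the actual file Wolfred wrote, and the whisper step is my assumption about the transcription method it found.

```markdown
---
name: transcribe-voice-notes
description: Transcribe incoming Telegram voice messages and answer the transcript.
---

When a message contains a voice note:

1. Download the audio file into the workspace.
2. Transcribe it (e.g. with a local whisper CLI -- my assumption here).
3. Treat the transcript as the user's message and reply normally.
```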
I definitely have some issues with the security aspect: you are essentially giving an LLM full access to your machine. That’s why many folks are buying a dedicated home for their Clawdbot (a Mac Mini seems to be the best option for many of them) and giving it secure access to passwords via a dedicated 1Password vault. I’ll keep you up to date on my endeavors with Clawd, but definitely do give it a try!
Installing
Installing Clawdbot on your machine is simple: go to clawd.bot and follow the instructions. Then find the most convenient way for you to talk to it (for me it was Telegram; creating a Telegram bot token takes 20 seconds), and then you can take it from there with Clawdbot itself! Ask it to do something, like clear your inbox or set a reminder, or... a million other things you need in your personal life, and enjoy discovering what an ever-present, always-on AI can do!
Other news that we didn’t have time to cover at length but that you should still know about:
Overworld released Waypoint-1, an open-source realtime AI world model (X)
Runway finally opened up their 4.5 video model, and it has image-to-video capabilities, including multi-shot image-to-video (X)
Vercel launches skills.sh, an “npm for AI agents” skill registry
Anthropic’s Claude Code VS Code Extension Hits General Availability (X)
Ok, this is it for this week, folks! I’m going to play with (and try to fix...) my Clawdbot, and I suggest you give it a try. Do let me know if the deep dives are a good format!
Show notes and links:
ThursdAI - Jan 22, 2026 - TL;DR and show notes
Hosts and Guests
Alex Volkov - AI Evangelist @ Weights & Biases (@altryne)
Co-hosts - @WolframRvnwlf @yampeleg @nisten @ldjconfirmed
Guest - Dan Peguine (@danpeguine)
Deep Dive - Clawdbot with Dan & Wolfram
Open Source LLMs
Z.ai releases GLM-4.7-Flash, a 30B parameter MoE model that sets a new standard for lightweight local AI assistants (X, Technical Blog, HuggingFace)
Liquid AI releases LFM2.5-1.2B-Thinking, a 1.2B parameter reasoning model that runs entirely on-device with under 900MB memory (X, HF, Announcement)
Sakana AI introduces RePo, a new way for language models to dynamically reorganize their context for better attention (X, Paper, Website)
Big CO LLMs + APIs
OpenAI announces testing ads in ChatGPT free and Go tiers, prioritizing user trust and transparency (X)
Anthropic publishes new 80-page constitution for Claude, shifting from rigid rules to explanatory principles that teach AI ‘why’ rather than ‘what’ to do (X, Blog, Announcement)
This week’s Buzz
WandB hackathon Weavehacks 3 - Jan 31 - Feb 1 in SF - limited seats available lu.ma/weavehacks3
Vision & Video
Overworld Releases Waypoint-1: Real-Time AI World Model Running at 60fps on Consumer GPUs (X, Announcement)
Voice & Audio
Alibaba Qwen Releases Qwen3-TTS: Full Open-Source TTS Family with 97ms Latency, Voice Cloning, and 10-Language Support (X, HF, GitHub)
FlashLabs Releases Chroma 1.0: World’s First Open-Source Real-Time Speech-to-Speech Model with Voice Cloning Under 150ms Latency (X, HF, Arxiv)
Inworld AI launches TTS-1.5: #1 ranked text-to-speech with sub-250ms latency at half a cent per minute (X, Announcement)
Tools
Vercel launches skills.sh, an “npm for AI agents” that hit 20K installs within hours (X, Vercel Changelog, GitHub)
Anthropic’s Claude Code VS Code Extension Hits General Availability, Bringing Full Agentic Coding to the IDE (X, VS Code Marketplace, Docs)