Hey, welcome to yet another ThursdAI 🫡
This episode is special for several reasons. For one, I shared a personal life update (you've got to listen to the episode to hear it 😉). It's also the first time I took on the mountainous challenge of fixing, editing and “video-fying” (is that a word?) our whole live recording! All 3 hours of it were condensed, sliced, sound-improved (X audio quality is really dogshit) and uploaded for your convenience. Please let me know what you think!
TL;DR of all topics covered
Open Source LLM
Big Co LLMs + API updates
Nothing major this week
Voice & Audio
Stable Audio 🎶 - A new music generation model from Stability AI. (Website)
Coqui XTTS - an open source multilingual text to speech for training and generating a cloned voice (Github, HuggingFace)
AI Art & Diffusion
Würstchen v2 - A new super quick 1024 diffusion model (Announcement, Demo, Github)
DiffBIR - Towards Blind Image Restoration with Generative Diffusion Prior (Announcement, Demo, Github)
Tools
Nougat from Meta - open-source OCR model that accurately scans books with heavy math/scientific notations (Announcement, Github, Paper)
GPT4All Vulkan from Nomic - Run LLMs on ANY consumer GPU, not just NVIDIA (Announcement)
Nisten’s AI ISO disk - Announcement
And here are timestamps and chapter/discussion topics for your convenience:
[00:05:56] Phi 1.5 - 1.3B parameter model that closely matches Falcon & LLaMa 7B
[00:09:08] Potential Data Contamination with Phi 1.5
[00:10:11] Data Contamination unconfirmed
[00:12:59] Tiny models are all the rage lately
[00:16:23] Synthetic Dataset for Phi
[00:18:37] Are we going to run out of training data?
[00:20:31] Breaking News - Nougat - OCR from Meta
[00:23:12] Nisten - AI ISO disk
[00:29:08] Baichuan 7B - an immaculate Chinese model
[00:36:16] Unique Loss Terms
[00:38:37] Baichuan Bilingual and Multilingual dataset
[00:39:30] Finetunes of Baichuan
[00:42:28] Philosophical questions in the dataset
[00:45:21] Let's think step by step
[00:48:17] Is breath related text in the original dataset?
[00:50:27] Counterintuitive prompting for models with no breath
[00:55:36] Idea spaces
[00:59:59] Alex - Life update about ThursdAI
[01:04:30] Stable Audio from Stability AI
[01:17:23] GPT4All Vulkan
[01:19:37] Coqui.ai releases XTTS - an open source TTS - interview with Josh Meyer
[01:30:40] Summary
Here’s a full video of the pod, and a full transcription, and as always, 🧡 thank you for being a paid subscriber. This really gives me the energy to keep going, get better guests, release dope podcast content, and have 3-hour spaces and then spend 7 hours editing 🔥