ThursdAI - Recaps of the most high signal AI weekly spaces
ThursdAI - The top AI news from the past week
ThursdAI Aug 17 - AI Vision, Platypus tops the charts, AI Towns, Self Alignment 📰 and a special interview with Platypus authors!

Also, talking ThursdAI more seriously, we have a few updates of our own!

Hey everyone, this is Alex Volkov, the host of ThursdAI, welcome to yet another recap of yet another incredibly fast-paced week.

Hey, if you’re new to ThursdAI and want to get updates, please subscribe, and if you’re already a subscriber, consider supporting 🫶

ThursdAI housekeeping

I want to start with a ThursdAI update: we now have a new website, Thursdai.news, and a new dedicated Twitter account, @thursdai_pod, as we build up the ThursdAI community and brand a bit more.

As always, a reminder that ThursdAI is a weekly X space, a newsletter and 2 (!) podcasts: the short form (Apple, Spotify) and the unedited long-form space recordings (RSS, Zealous page) for those who’d like the nitty gritty details (and are on a long drive somewhere).

Open Source LLMs & Finetuning

Honestly, the speed with which LLaMa 2 finetunes are taking over state of the art performance is staggering. We literally talk about a new model every week that’s topping the LLM Benchmark leaderboard, and it hasn’t even been a month since LLaMa 2 release day 🤯 (July 18 for those who are counting)

Enter Platypus 70B (🔗)

Platypus 70B-instruct is currently the highest ranked open source LLM, alongside other Platypus versions

We’ve had the great pleasure to chat with new friends of the pod Arielle Lee and Cole Hunter (and long time friend of the pod Nataniel Ruiz, co-author of DreamBooth and StyleDrop, which we’ve covered before) about this incredible effort to finetune LLaMa 2, the open dataset they curated and released as part of this effort, and how quick and easy it is to train a (smaller, 13B) version of Platypus (just 5 hours on a single A100 GPU ~= $6 on Lambda 🤯)

We had a great interview with Garage BAIND, the authors of Platypus, and we’ll be posting that on a special Sunday episode of ThursdAI, so make sure you are subscribed to receive that when it drops. It’s here!!

Open Orca + Platypus = OrctyPus 13B? (🔗)

We told you about OpenOrca, from our friends at @alignment_lab, just last week. Not only is Platypus the best performing 70B model, the open source community has also come through with an incredible collaboration to bring you the best 13B model, which is a merge between OpenOrca and Platypus.

This 13B model is now very close to the original LLaMa 70B in many of the metrics, LESS THAN A MONTH after the initial open source release. It’s quite a remarkable achievement and we salute the whole community for this immense effort 👏 Also, accelerate! 🔥

HuggingFace Leaderboard table. Showing OpenOrca-Platypus2-13B above llama-65b and all other 13B models. Highlighting OpenOrcaxOpenChat-Preview2-13B and Platypus2-13B as the models that it was merged from.

Join the SkunksWorks

Speaking of fast moving things, in addition to the above interview, we had a great conversation with folks from the so-called SkunksWorks OS discord, namely Far El, Prateek Yadav, Alpay Ariak, Teknium and Alignment Labs. Together with our recurring guest hosts Yam Peleg and Nisten, they covered two very exciting community efforts, both happening within the SkunksWorks Discord.

The first effort is called MoE, Open Mixture of Experts, which is an open source attempt at replicating the Mixture of Experts architecture, which is widely believed to be the reason GPT-4 is so much better than GPT-3.
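For readers new to the term, here is a minimal sketch of the general Mixture of Experts idea: a gate scores every expert, only the top-k experts run, and their outputs are blended by softmax weight. This is a toy illustration, not the SkunksWorks code and not a description of how GPT-4 works. The names `moeForward`, `Expert` and the toy experts are hypothetical, and in a real model the gate scores would come from a learned router rather than being passed in by hand.

```typescript
// An "expert" is just a function from an input vector to an output vector.
// Assumption: every expert returns a vector of the same length as its input.
type Expert = (x: number[]) => number[];

// Numerically stable softmax over a list of scores.
function softmax(logits: number[]): number[] {
  const max = Math.max(...logits);
  const exps = logits.map((l) => Math.exp(l - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

// Run only the k best-scoring experts and blend their outputs.
// gateLogits are supplied by the caller here; a real router would compute them from x.
function moeForward(
  x: number[],
  gateLogits: number[],
  experts: Expert[],
  k: number,
): number[] {
  const topK = gateLogits
    .map((logit, index) => ({ logit, index }))
    .sort((a, b) => b.logit - a.logit)
    .slice(0, k);
  const weights = softmax(topK.map((entry) => entry.logit));

  const output = new Array<number>(x.length).fill(0);
  topK.forEach((entry, i) => {
    const y = experts[entry.index](x);
    y.forEach((value, j) => {
      output[j] += weights[i] * value;
    });
  });
  return output;
}

// Three toy experts; the gate strongly prefers the first two.
const toyExperts: Expert[] = [
  (x) => x.map((v) => v * 2),
  (x) => x.map((v) => -v),
  (x) => x.map((v) => v + 1),
];
const moeOutput = moeForward([1, 2], [2, 2, -5], toyExperts, 2);
console.log(moeOutput);
```

The point of the design is that only k experts do any work per input, so the total parameter count can grow without every input paying for all of it.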

The second effort is called Ablation Studies, which Teknium is leading to understand, once and for all, what is the best, cheapest and highest quality way to finetune open source models, whether it's QLoRA, LoRA or a full finetune.

If you're interested in any of these, either by helping directly or by providing resources such as GPU compute, please join the SkunksWorks discord. They will show you how to participate, even if you don't have prior finetuning knowledge! And we’ll keep you apprised of the results once they release any updates!

Big Co LLMs + API updates

In our Big Co corner, we start with an incredible paper from MetaAI, announcing:

Self-Alignment w/ Backtranslation method + Humpback LLM - MetaAI

Summarized briefly (definitely listen to the full episode and @yampeleg's detailed overview of this method), it’s an unsupervised way for an LLM to create high quality training datasets for itself, starting from only a small amount of initial “seed” data from a high quality dataset. Think of it this way: fine-tuning a model requires a lot of “question → response” data in your dataset, and back-translation proposes “response → question” dataset generation, asking “what would a potential instruction be that would make an LLM generate this result?”

This results in a model that effectively learns to learn better and creates its own datasets without humans (well, at least human labelers) in the loop.
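To make the “response → question” direction concrete, here is a minimal sketch of the data flow, under stated assumptions. It is not Meta's implementation: `BackwardModel` and `QualityScorer` are hypothetical stand-ins for model calls, the quality filter is our reading of what “high quality” implies, and the toy stubs exist only so the sketch runs without any model.

```typescript
// One training example: an instruction paired with the response it should elicit.
interface Example {
  instruction: string;
  response: string;
}

// Hypothetical stand-ins for model calls (assumptions, not a real API).
type BackwardModel = (response: string) => string; // response -> guessed instruction
type QualityScorer = (example: Example) => number; // higher means better

// Turn unlabeled text into (instruction, response) pairs, then keep only
// the pairs that the scorer rates at or above minScore.
function backtranslate(
  unlabeledResponses: string[],
  guessInstruction: BackwardModel,
  score: QualityScorer,
  minScore: number,
): Example[] {
  return unlabeledResponses
    .map((response) => ({ instruction: guessInstruction(response), response }))
    .filter((example) => score(example) >= minScore);
}

// Toy stubs: the "model" builds an instruction from the first three words,
// and the "scorer" simply prefers longer responses.
const toyBackward: BackwardModel = (response) =>
  `Write a short text about: ${response.split(" ").slice(0, 3).join(" ")}`;
const toyScorer: QualityScorer = (example) =>
  example.response.length >= 20 ? 5 : 1;

const curated = backtranslate(
  ["Platypuses are egg-laying mammals native to Australia.", "ok"],
  toyBackward,
  toyScorer,
  4,
);
console.log(curated);
```

In a real setup both stubs would be LLM calls, and the curated pairs would be fed back into fine-tuning, which is what lets the loop run without human labelers.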

Here is some more reading material on X for reference.

OpenAI new JS SDK (X link)

OpenAI has partnered with StainlessAPI to release a major new version 4 of their TS/JS SDK, with the following incredible DX improvements for AI engineers:

  • Streaming responses for chat & completions

  • Carefully crafted TypeScript types

  • Support for ESM, Vercel edge functions, Cloudflare workers, & Deno

  • Better file upload API for Whisper, fine-tune files, & DALL·E images

  • Improved error handling through automatic retries & error classes

  • Increased performance via TCP connection reuse

  • Simpler initialization logic

The most exciting part for me is that it is now very easy to get started with AI projects and get streaming on the incredible Cloudflare workers platform (Targum is part of the first Cloudflare workers launchpad but is not affiliated, we’re just superfans 🫶)
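Here is a minimal sketch of what consuming a streamed chat response can look like. To keep it runnable without an API key, it reads from a fake stream. `ChatChunk`, `collectStream` and `fakeStream` are hypothetical names for this example, and `ChatChunk` only mirrors the few chunk fields the sketch reads. The commented usage at the bottom shows how we'd expect to plug in the v4 SDK; the model name there is an assumption.

```typescript
// Minimal structural type for a streamed chat chunk (only the fields read below).
interface ChatChunk {
  choices: { delta: { content?: string | null } }[];
}

// Consume any async stream of chunks, calling onToken for each piece of text
// as it arrives and returning the full text at the end.
async function collectStream(
  stream: AsyncIterable<ChatChunk>,
  onToken: (token: string) => void = () => {},
): Promise<string> {
  let text = "";
  for await (const chunk of stream) {
    const token = chunk.choices[0]?.delta?.content ?? "";
    if (token) {
      onToken(token);
      text += token;
    }
  }
  return text;
}

// Fake stream standing in for the network, so the sketch runs offline.
async function* fakeStream(): AsyncGenerator<ChatChunk> {
  for (const content of ["Hello", ", ", "ThursdAI", "!"]) {
    yield { choices: [{ delta: { content } }] };
  }
  yield { choices: [{ delta: {} }] }; // a final chunk may carry no content
}

collectStream(fakeStream(), (token) => console.log("token:", token)).then(
  (text) => console.log("full text:", text),
);

// With the real SDK (needs `npm install openai` and OPENAI_API_KEY set):
//   import OpenAI from "openai";
//   const openai = new OpenAI();
//   const stream = await openai.chat.completions.create({
//     model: "gpt-3.5-turbo", // assumed model name
//     messages: [{ role: "user", content: "Say hello" }],
//     stream: true,
//   });
//   const text = await collectStream(stream);
```

Keeping the consumer independent of the SDK means the same function works in tests, in Node and on edge runtimes, as long as the source is an async iterable.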

Vision & Multi Modality

There’s been some really cool stuff happening in computer vision and multi-modal AI recently. First up, a new method called 3D Gaussian Splatting, which offers an incredibly clear and smooth way to generate 3D scenes from just a few images.

Compared to neural radiance fields (NeRFs), Gaussian splatting produces much smoother results without the grainy voxel artifacts NeRFs often have, and it achieves this improved quality without sacrificing speed or performance. So Gaussian splatting gives a big boost in realism compared to NeRF renderings, cleaning up those “clouds” while maintaining real-time speeds.

Supervision from Roboflow (and Piotr)

Btw, our own friend of the pod and AI Vision expert @skalskiP (who reviewed Gaussian Splatting for us) is also having a crazy ThursdAI week, with their open source library SuperVision, a computer vision toolkit, trending #2 on GitHub 👏

Apple stepping in their Vision (not the headset) Transformer game

Apple has open sourced ml-fastvit, their general purpose Vision Transformer model, which they claim runs at ~1ms on mobile devices, with code and pre-trained weights available on GitHub 🔥

This is great to see from Apple ML teams, not only because they are open sourcing, but also because they are preparing all of us for the world of spatial computers (Vision Pro is coming, remember?), where many new Computer Vision heavy apps will be available at those incredible speeds.

This is also great for on-device inference, running these models in node / on edge (as friend of the pod @visheratin demonstrated with WebAI)

Additional updates included Nvidia releasing a web playground for NeVa, their MLLM (Multimodal LLM, get used to seeing this term everywhere), which you can play with here, and Link-Context learning for MLLMs

Agents

OpenAI also announced that Global Illumination is joining the company. That team's CEO is the creator of the Instagram Stories algorithm and a feed contributor, and the team is behind a massive open-world Minecraft clone. Will we see OpenAI release agents into that world? We know that they are working on agents.

A16Z - AI Town (🔗)

Speaking of agents roaming free and interacting, we covered the open sourcing of SmallVille just last week, and now we see a new open source framework called AI Town, from the Andreessen Horowitz AI division, for letting agents roam and interact with each other.

AI Town (GitHub) is a web framework written in TypeScript, built to be run and customized with different LLMs (even open source ones) in mind, and you can see the AI agents running around in a live demo here


This ThursdAI was so packed with great information that it’s really worth listening to the whole recording. You can do this on our Zealous page, via RSS and on Twitter (all those links can always be found on thursdai.news)

If you found this valuable, join our community and let your friends know! This is a great way to support us. You can also participate in the discussion on social: tag #thursdAI on anything you feel is worthwhile for us to summarize and

Refer a friend


Every ThursdAI, Alex Volkov hosts a panel of experts, AI engineers, data scientists and prompt spellcasters on Twitter spaces, as we discuss everything major and important that happened in the world of AI over the past week.
Topics include LLMs, open source, new capabilities, OpenAI, competitors in the AI space, new LLM models, AI art and diffusion, and much more.