Paul, Weiss Waking Up With AI

Decentralizing AI

This week on “Waking Up With AI,” Anna Gressel looks at how decentralized AI training could revolutionize the field by allowing for the collaborative use of advanced GPUs worldwide, expanding access to model development while raising interesting questions about export controls and regulatory frameworks.

Episode Transcript

Anna Gressel: Good morning, everyone, and welcome to another episode of “Waking Up With AI,” a Paul, Weiss podcast. I'm Anna Gressel, and today it is just me, which is maybe both a blessing and a curse for everyone on the line, because we're going to take on a somewhat technical topic. And I know our regular listeners are probably going to be sitting around waiting for Katherine to poke fun at me for using too much jargon, so you'll just have to imagine her on the line, laughing at our highly technical concepts today. But I think it's a topic that is really interesting and worth exploring for a moment: decentralized AI training. And we'll talk about what that means. But to take a step back, why are we even talking about training, and why are we talking about this alternative method of training? That's really the question of the day.

But to ground ourselves for a moment, it's a common refrain in AI circles that model progress happens on three fronts: compute, data and algorithms. We've taken this on in the past and talked about these pillars of AI model development on prior podcast episodes, so I'm not going to get into a huge amount of detail on how each of them plays into model development, though we're happy to do that if listeners think we've under-covered the topic. Each of these pillars is, in a sense, critical to AI training. But also, as we've discussed with the DeepSeek model in particular (a really big model release that raised a lot of questions around the AI training space), there are really important issues and questions at play about where exactly the competitive edge is going to come from for future AI development. And everyone's looking for that competitive edge now. Everyone's wondering what is going to give the most advanced models the most competitive undergirding and the most competitive set of levers. So it's possible, when you think about it, that some models are going to outcompete because they have a particularly powerful model architecture. We see this a little bit with some of the advances in Mixture of Experts models: they're able to do very advanced things even though they may have fewer parameters than other large models that aren't Mixture of Experts models. Other models are going to be particularly competitive because of the sheer volume of training data and the amount of compute required to train them; those are the really high-parameter models. And right now, all of these different levers, the compute, the data, the algorithms, are really important.
And we're seeing companies experiment with how to move those levers to get the best outcomes, sometimes at lower cost or with a faster development life cycle. So there's a lot of focus on this right now: how to actually come up with the most competitive, most advanced models.