Dr. Ronen Dar (Co-Founder/CTO of @runailabs) talks about the challenges of running compute infrastructure for AI, the GPU ecosystem, sizing LLMs, and more.
SHOW: 739
CLOUD NEWS OF THE WEEK - http://bit.ly/cloudcast-cnotw
NEW TO CLOUD? CHECK OUT - "CLOUDCAST BASICS"
SHOW SPONSORS:
SHOW NOTES:
Topic 1 - Welcome to the show. Tell us a little bit about your background, and what you focus on at Run:ai.
Topic 2 - Let’s begin by talking about the challenges of running AI applications. What unique characteristics and requirements do AI applications have?
Topic 3 - Most AI applications run on GPUs. How do things change when using GPUs vs. CPUs to power AI applications? What is needed to get the most out of GPUs?
Topic 4 - As environments grow larger, what is needed to scale them up, both in terms of scheduling applications and managing the underlying GPU infrastructure?
Topic 5 - GPUs are not only expensive resources, but also in high demand. How are companies doing capacity planning with GPUs? What struggles do you see companies having as they plan for AI projects?
Topic 6 - Are the new Large Language Models (LLMs) much different in size from AI models of the past?
Topic 7 - How well is the industry prepared to handle the new, widespread interest in AI?
FEEDBACK?