How AI Is Built

Multimodal AI, Storing 1 Billion Vectors, Building Data Infrastructure | ep 1

34 min • April 5, 2024

Imagine a world where data bottlenecks, slow data loaders, or memory issues on the VM don't hold back machine learning.

Success in machine learning and AI depends on how quickly you can iterate. LanceDB is here to enable fast experiments on top of terabytes of unstructured data. It is the database for AI. Dive with us into how LanceDB was built, what went into the decision to use Rust as the main implementation language, the potential of AI on top of LanceDB, and more.

"LanceDB is the database for AI...to manage their data, to do a performant billion scale vector search."

"We're big believers in the composable data systems vision."

"You can insert data into LanceDB using pandas DataFrames...to sort of really large 'embed the internet' kind of workflows."

"We wanted to create a new generation of data infrastructure that makes their [AI engineers] lives a lot easier."

"LanceDB offers up to 1,000 times faster performance than Parquet."

Chang She:

LanceDB:

Nicolay Gerold:

Chapters:

00:00 Introduction to LanceDB

02:16 Building LanceDB in Rust

12:10 Optimizing Data Infrastructure

26:20 Surprising Use Cases for LanceDB

32:01 The Future of LanceDB

LanceDB, AI, database, Rust, multimodal AI, data infrastructure, embeddings, images, performance, Parquet, machine learning, model database, function registries, agents.
