Hey! Welcome back.
Today we look at how we can get our RAG system ready for scale.
We discuss the common problems that appear as you add more users and more requests to your system, along with their solutions.
For this we are joined by Nirant Kasliwal, the author of fastembed.
Nirant shares practical insights on metadata extraction, evaluation strategies, and emerging technologies like ColPali. This episode is a must-listen for anyone looking to level up their RAG implementations.
"Naive RAG has a lot of problems on the retrieval end and then there's a lot of problems on how LLMs look at these data points as well."
"The first 30 to 50% of gains are relatively quick. The rest 50% takes forever."
"You do not want to give the same answer about company's history to the co-founding CEO and the intern who has just joined."
"Embedding similarity is the signal on which you want to build your entire search is just not quite complete."
Key insights:
Nirant Kasliwal:
Nicolay Gerold:
query understanding, AI-powered search, LambdaMART, e-commerce ranking, networking, experts, recommendation, search