Today’s guest is Antonio Bustamante, a serial entrepreneur who previously built Kite and Silo and is now working to fix bad data. He is building bem, a data tool that transforms any data into the schema your AI and software need.
bem.ai acts as a system's interoperability layer, allowing systems that couldn't communicate before to exchange information. Learn what role LLMs play in data transformation, how to build reliable data infrastructure, and more.
"Surprisingly, the hardest was semi-structured data. That is data that should be structured, but is unreliable, undocumented, hard to work with."
"We were spending close to four or five million dollars a year just in integrations, which is no small budget for a company that size. So I was pretty much determined to fix this problem once and for all."
"bem focuses on being the system's interoperability layer."
"We basically take in anything you send us, we transform it exactly into your internal data schema so that you don't have to parse, process, transform anything of that sort."
"LLMs are a 30% of it... A lot of it is very, very like thorough validation layers, great infrastructure, just ensuring reliability and connection to our user systems."
"You can use a million-token context window and feed an entire document to an LLM. I can guarantee you, if you don't semantically chunk it out before, you're not going to get the right results."
"We're obsessed with time to value... Our milestone is basically five minute onboarding max, and then you're ready to go."
Antonio Bustamante
Nicolay Gerold
Semi-structured data, Data integrations, Large language models (LLMs), Data transformation, Schema interoperability, Fault tolerance, Validation layers, System reliability, Schema evolution, Enterprise software, Data pipelines.
Chapters
00:00 The Problem of Integrations
05:58 Building Fault Tolerant Systems
13:51 Versioning and Semantic Validation
27:33 bem in the Data Ecosystem
34:40 Future Plans and Onboarding