In technology, as in life, bad experiences often start with bad data. This is as true with a misguided aunt setting you up on a blind date as it is with machine learning recommendations about which podcast you might want to listen to next. High quality data is essential to making sure every Spotify listener has a rewarding experience made just for them. But with a half trillion events happening on the platform every day, how do you even begin to sort all that data out?
Enter Laura Lake, senior director of Spotify’s Personalization Insights team. When she first arrived, even simple questions were difficult to answer without teams having to “knit together 50 different data sources.” In this episode, she talks with host Dave Zolotusky about a critical point in Spotify’s growth and the yearslong journey to improve and ensure the quality of the data that all our developers rely on.
Hear about the technological and cultural changes that led to both better quality data and better collaboration between our teams — and how we use the data to build the knowledge models that lead to Discover Weekly, Daily Mix, and a more personalized experience for every one of Spotify’s hundreds of millions of listeners. How do we know if our ML models are doing what we want them to? Did our listeners actually discover something new? Let’s dig into the data.
This is the second episode in our miniseries about machine learning and personalization at Spotify.
Read what else we’re nerding out about on the Spotify Engineering Blog: engineering.atspotify.com
Follow us on Twitter @SpotifyEng and on LinkedIn!