How do you get a machine to find a song that’s similar to another song? What properties of the song should it look for? And does it then just compare each track to every other track, one by one, until it finds the closest match? When you have a catalog of 100 million different music tracks, like we do at Spotify, that would take a long time. So, for these kinds of problems, we use a technique known as nearest neighbor search (NNS). This past summer at Spotify, we built a new library for nearest neighbor search. It’s called Voyager, and we open sourced it.
Host and principal engineer Dave Zolotusky talks with Peter Sobot and Mark Koh, two of the machine learning engineers who developed Voyager. They discuss using nearest neighbor search for recommendations and personalization, how to go from searching for vectors in a 2D space to searching for them in a space with thousands of dimensions, the relative funkiness and danceability of Mozart and Bach, how to find a place on a map when you don’t have the exact coordinates, tricky acronyms (Annoy: “Approximate Nearest Neighbor Oh Yeah”) and initialisms (HNSW: “Hierarchical Navigable Small World”), why we stopped using our old NNS library, why we open sourced the new one, how it works for use cases beyond music (like LLMs), and looking for ducks in grass.
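To make the "compare each track to every other track" idea concrete, here is a minimal sketch of brute-force nearest neighbor search in plain Python. It is not Voyager and not how Spotify does it: the track names, the two features (funkiness and danceability), and all the numbers are made up for illustration, and the function name `nearest_neighbors` is hypothetical.

```python
import math

# Hypothetical catalog: track name -> (funkiness, danceability).
# All values are invented for illustration.
CATALOG = {
    "track_a": (0.9, 0.8),
    "track_b": (0.2, 0.1),
    "track_c": (0.85, 0.75),
    "track_d": (0.5, 0.4),
}


def nearest_neighbors(query, catalog, k=1):
    """Brute-force search: measure the Euclidean distance from the
    query vector to every track, then keep the k closest."""
    scored = [(math.dist(query, vec), name) for name, vec in catalog.items()]
    scored.sort()  # smallest distance first
    return [name for _, name in scored[:k]]


# Find the two tracks most similar to track_a, excluding track_a itself.
others = {name: vec for name, vec in CATALOG.items() if name != "track_a"}
closest = nearest_neighbors(CATALOG["track_a"], others, k=2)
print(closest)
```

Every query here touches every track, so the work grows linearly with the size of the catalog. That is fine for four tracks and far too slow for 100 million, which is why approximate approaches such as HNSW trade a little accuracy for much faster lookups.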
Learn more about Spotify Voyager:
Read what else we’re nerding out about on the Spotify Engineering Blog: engineering.atspotify.com
You should follow us on Twitter @SpotifyEng and on LinkedIn!