Word embeddings might feel a little out of fashion. After all, we have attention mechanisms and transformer models now, right? Well, it turns out that if you apply distillation the right way, you can still get highly performant word embeddings. This technique is featured in the model2vec project from the Minish Lab, and in this episode we talk to the founder to learn more about how it works.
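For the curious, the core idea is to distill a large Sentence Transformer into a small static embedding model: token embeddings are computed once, compressed, and stored as a plain lookup table. A rough sketch of what that looks like with the model2vec package (the model name and parameters here are illustrative, so check the project's docs for the current API):

```python
from model2vec.distill import distill

# Distill a Sentence Transformer into a static embedding model.
# Each vocabulary token is embedded once by the teacher model,
# reduced with PCA, and stored in a lookup table.
m2v_model = distill(model_name="BAAI/bge-base-en-v1.5", pca_dims=256)

# Encoding is now just a fast table lookup plus pooling,
# no transformer forward pass needed at inference time.
embeddings = m2v_model.encode(["Word embeddings are back in fashion."])

# Save the distilled model for reuse.
m2v_model.save_pretrained("m2v_model")
```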
We have a Discord these days! Feel free to discuss the podcast with us there: https://discord.probabl.ai
This podcast is part of the open efforts over at probabl.
To learn more, you can check out our website or reach out to us on social media.
Website: https://probabl.ai/
Bluesky: https://bsky.app/profile/probabl.bsky.social
LinkedIn: https://www.linkedin.com/company/probabl
Twitter: https://x.com/probabl_ai
#probabl