Start / Deep Papers / A deep dive into generatives newest models gemini vs mistral mixtral 8x7b part i

A Deep Dive Into Generative's Newest Models: Gemini vs Mistral (Mixtral-8x7B)–Part I

48 min • 27 december 2023

For the last paper read of the year, Arize CPO & Co-Founder, Aparna Dhinakaran, is joined by a Dat Ngo (ML Solutions Architect) and Aman Khan (Product Manager) for an exploration of the new kids on the block: Gemini and Mixtral-8x7B.

There's a lot to cover, so this week's paper read is Part I in a series about Mixtral and Gemini. In Part I, we provide some background and context for Mixtral 8x7B from Mistral AI, a high-quality sparse mixture of experts model (SMoE) that outperforms Llama 2 70B on most benchmarks with 6x faster inference Mixtral also matches or outperforms GPT3.5 on most benchmarks. This open-source model was optimized through supervised fine-tuning and direct preference optimization.

Stay tuned for Part II in January, where we'll build on this conversation in and discuss Gemini-developed by teams at DeepMind and Google Research.

Link to transcript and live recording: https://arize.com/blog/a-deep-dive-into-generatives-newest-models-mistral-mixtral-8x7b/

Learn more about AI observability and evaluation, join the Arize AI Slack community or get the latest on LinkedIn and X.

Kategorier

Matematik Poddar Teknologi Vetenskap

Förekommer på

Vetenskap

00:00 -00:00