258 episodes • Length: 40 min • Weekly: Thursday
A series of informal conversations with thought leaders, researchers, practitioners, and writers on a wide range of topics in technology, science, and of course big data, data science, artificial intelligence, and related applications. Anchored by Ben Lorica (@BigData), the Data Exchange also features a roundup of the most important stories from the worlds of data, machine learning and AI. Detailed show notes for each episode can be found at https://thedataexchange.media/. The Data Exchange podcast is a production of Gradient Flow [https://gradientflow.com/].
The podcast The Data Exchange with Ben Lorica is created by Ben Lorica. The podcast and its artwork are embedded on this page using the public podcast feed (RSS).
Vasant Dhar is a Professor at the Stern School of Business and the Center for Data Science at NYU. He’s one of the creators of the Damodaran Bot, an AI-powered system designed to emulate the valuation analysis and investment insights of renowned finance professor Aswath Damodaran. This episode explores the transformative impact of AI in finance, covering applications such as generative AI, AI-powered valuation bots, systematic investing, and narrative analysis. It delves into the development of an AI valuation bot, discussing motivations, technical approaches, and challenges.
Subscribe to the Gradient Flow Newsletter: https://gradientflow.substack.com/
Subscribe: Apple • Spotify • Overcast • Pocket Casts • AntennaPod • Podcast Addict • Amazon • RSS.
Detailed show notes - with links to many references - can be found on The Data Exchange web site.
Vaibhav Gupta is the CEO and co-founder of Boundary. In this episode, we explore BAML, an open source domain-specific language designed to streamline interactions with large language models (LLMs).
In this conversation with Tim Persons, AI Leader at PwC, we explore the current landscape of generative AI adoption, examining how enterprises are navigating budget trends, moving from experimentation to full-scale deployment, and addressing cultural challenges along the way.
This is our monthly conversation on topics in AI and Technology with Paco Nathan, the founder of Derwen, a boutique consultancy focused on Data and AI.
Matt Welsh is a technical leader at Aryn AI, an AI-powered ETL system for RAG frameworks, LLM-based applications, and vector databases. In this episode, we explore how AI is revolutionizing programming and software development.
Mars Lan is Co-Founder & CTO at Metaphor, an AI-powered social platform that enhances data governance by empowering all employees, not just data teams, to easily collaborate, search, and share insights through an intuitive, AI-driven interface.
Yishay Carmiel is the CEO of Meaning, a startup building real-time generative AI systems focused on voice applications.
Aurimas Griciūnas is the Chief Product Officer of Neptune.AI, a startup building experiment tracking tools for foundation model training.
This is our monthly conversation on topics in AI and Technology with Paco Nathan, the founder of Derwen, a boutique consultancy focused on Data and AI.
Petros Zerfos and Hima Patel of IBM Research are part of the team behind Data Prep Kit, an open-source toolkit that helps process and prepare raw text and code data at scale for use in large language model applications.
Dr. Andrew Ng is a globally recognized AI leader, founder of DeepLearning.AI and Landing AI, General Partner at AI Fund, Chairman and Co-Founder of Coursera, and Adjunct Professor at Stanford University.
Jay Dawani is CEO and founder of Lemurian Labs, a pioneering startup building a software stack for developing advanced AI systems, focusing on pushing the boundaries of computational capabilities and model performance.
This is our monthly conversation on topics in AI and Technology with Paco Nathan, the founder of Derwen, a boutique consultancy focused on Data and AI.
Evangelos Simoudis is Managing Director at Synapse Partners, a firm that assists corporations in implementing AI solutions, and invests in startups developing applications that exploit data using AI.
Shuveb Hussain is co-founder of Unstract, a no-code platform that uses large language models to extract structured data from unstructured documents, allowing users to build API endpoints and ETL pipelines to automate document processing workflows.
Alfred Spector’s distinguished career includes groundbreaking work in networked computing systems and leadership roles in research at IBM, Google, and Two Sigma Investments. He is currently a visiting scholar at MIT.
This is our monthly conversation on trending topics in AI and Technology with Paco Nathan, the founder of Derwen, a boutique consultancy focused on Data and AI.
Andrew Burt is co-founder of both Luminos.Law and Luminos.ai, entities building tools to help companies mitigate and manage AI risks. We dive into the critical topic of AI incident response, highlighting its unique challenges compared to traditional software incidents.
Chang She is CEO and co-founder of LanceDB, an open-source database designed for multimodal AI applications, offering scalable vector search, streaming training data, and interactive exploration of large AI datasets. In this episode we discuss Lance, an open-source columnar data format that tackles the unique challenges posed by modern AI and machine learning workloads.
Ajay Kulkarni and Mike Freedman are the co-founders of Timescale, a startup that provides an enhanced version of PostgreSQL optimized for time-series analytics, AI applications, and scalable relational workloads.
Philip Rathle, CTO of Neo4j, joins the podcast to discuss the rising popularity of graph-enhanced retrieval augmented generation (GraphRAG). He also discusses the potential impact of the new GQL graph query language standard. [Link to the demo that Philip showed.]
Paco Nathan is the founder of Derwen, a boutique consultancy focused on Data and AI. This episode is part of our series of monthly roundups and covers: the proposed California Senate Bill 1047 for regulating AI models, including its feasibility and potential unintended consequences. We also discuss the rising popularity of graph retrieval augmented generation (GraphRAG) techniques to mitigate hallucinations in large language models, while acknowledging the current limitations and future potential of integrating symbolic and statistical AI approaches. Additionally, we explore the concept of AI avatars in the workplace, highlighting the challenges and ethical considerations surrounding digital twins and agent-based systems.
Jiwoo Hong and Noah Lee of KAIST AI are co-authors of ORPO: Monolithic Preference Optimization without Reference Model.
In this episode, Pete Warden introduces his company, Useful Sensors, which focuses on developing AI solutions for consumer electronics and appliances. [This episode originally aired on Generative AI in the Real World, a podcast series I’m hosting for O’Reilly.]
Ken Liu, Ph.D. student in Computer Science at Stanford, is the author of Machine Unlearning in 2024. We explore the concept of machine unlearning, a process of removing specific data points from trained AI models.
Joao (Joe) Moura is the founder of crewAI, an open-source platform that simplifies the development and deployment of AI agents, allowing users to build autonomous systems for various tasks using multiple large language models.
Paco Nathan is the founder of Derwen, a boutique consultancy focused on Data and AI. This episode is part of our series of monthly roundups and covers: Llama 3 and other recent LLMs, the rise of open foundation models, the evolution of AI agents, and the importance of data engineering. We also explore the limitations of leaderboards in evaluating AI models and touch upon the ethical and societal implications of AI development.
Gunther Hagleitner is co-founder of Waii, a startup that provides an API enabling businesses to seamlessly integrate text-to-SQL functionality into their products.
In this episode we explore the latest developments in artificial intelligence with a focus on the 2024 Artificial Intelligence Index Report, edited by Nestor Maslej from Stanford’s Institute for Human-Centered Artificial Intelligence.
In this episode, Hagay Lupesko, Senior Director of Engineering at Databricks MosaicAI, delves into the creation and aspirations behind DBRX, an innovative open Large Language Model (LLM) designed to bridge the gap between quality and cost-effectiveness for AI applications.
Paco Nathan is the founder of Derwen, a boutique consultancy focused on Data and AI. This episode is part of our series of monthly roundups and covers: recently released large language models, Constraint-Driven Innovation, highlights from GTC 2024, and Lessons from the First AI Workload Security Exploit.
Steve Pike is a co-founder of Infield.ai, a startup building tools to help companies upgrade and maintain open source software dependencies, ensuring they stay up-to-date with the latest releases, features, and security fixes.
Chetan Gupta is the Head of AI Research at Hitachi. This episode explores the applications and challenges of generative AI in industrial settings.
Semih Salihoglu is an Associate Professor at the University of Waterloo and co-creator of Kuzu, an open source embeddable property graph database management system. This episode explores the use of large language models (LLMs) for generating queries across different query languages like SQL and Cypher for graphs.
Sadegh Riazi is CEO and co-founder of Pyte, a startup offering secure, encrypted data collaboration solutions that enable partners to maximize insights without compromising privacy or data integrity.
Paco Nathan is the founder of Derwen, a boutique consultancy focused on Data and AI. This episode explores recent developments in AI, including text-to-video models like Sora, frameworks for productionizing AI models, analyses of systems like Google’s Gemini, techniques to improve foundation models, AMD’s software innovations for AI acceleration, and knowledge graph augmentations of language models.
Jerry Kaplan is the author of the new book “Generative Artificial Intelligence: What Everyone Needs to Know”.
This episode is our annual deep dive into the themes and trends of AI in 2024, emphasizing the democratization of AI hardware, advancements in generative AI models, and the integration of AI into various enterprise processes.
Bryan Cantrill, CTO and Co-founder of Oxide Cloud Computer, leads a startup delivering integrated hardware and software solutions for enterprises seeking cloud computing systems with hyperscaler agility. Oxide specializes in vertically integrated, scale-ready cloud infrastructure tailored for mainstream business needs.
Evangelos Simoudis is a seasoned venture investor and a senior advisor to global corporations, and Managing Director at Synapse Partners, a company that invests in startups developing enterprise applications that exploit Big Data and AI.
Sharon Zhou and Greg Diamos are co-founders of Lamini, a startup at the forefront of enabling enterprise adoption of large language models (LLMs). We discussed Lamini’s work with AMD, which focused on closing the gap between AMD hardware capabilities and software integration in LLM applications.
Uri Gneezy is Professor of Economics and Strategy at UC San Diego, and author of our 2023 Book of the Year, “Mixed Signals: How Incentives Really Work”.
Dmitriy Ryaboy is the VP of AI Enablement at Ginkgo Bioworks, a startup that uses machine learning and AI to develop a wide range of applications. The conversation focuses on the intersection of AI, machine learning, and biology, particularly in the field of synthetic biology.
Jian Zhang is co-founder, CTO, and VP of Engineering at Nexusflow AI, a startup that uses generative AI to build tools for cybersecurity. This conversation revolves around the integration of various AI components, with a specific focus on cybersecurity and function-calling copilots.
Sarmad Qadri is founder and CEO of LastMile, a startup building an AI developer platform for engineering teams. This conversation delves into key artificial intelligence and machine learning themes, focusing on injecting software engineering rigor into the development of LLM and GenAI applications.
Nir Shavit, Professor at MIT’s Computer Science and Artificial Intelligence Laboratory, is also a Founder of Neural Magic, a startup working to accelerate open-source large language models and simplify AI deployments.
Chirag Yagnik is a co-founder of Arta, a company that harnesses innovations in artificial intelligence and software to develop wealth management solutions. Arta aims to democratize access to sophisticated investment tools typically only available to ultra-high net worth individuals through family offices.
Juan Sequeda (Principal Scientist & Head of AI Lab) and Dean Allemang (Principal Solutions Architect) are knowledge graph experts at data.world, a startup that offers a data catalog powered by a knowledge graph to help organizations better understand and gain value from their data.
Max Mergenthaler (CEO) and Azul Garza Ramirez (CTO) are co-founders of Nixtla, a startup that seeks to make cutting-edge predictive insights widely accessible. In this episode we discuss TimeGPT, Nixtla’s new frontier model for time series forecasting.
Waleed Kadous, Chief Scientist at Anyscale, is one of my go-to experts for best practices on building applications leveraging large language models.
Kieren James-Lubin is CEO of BlockApps and Co-Chair of the Technical Steering Community for the Enterprise Ethereum Alliance.
Earlier this year, I had a conversation with Sam Ramji, Chief Strategy Officer at DataStax and host of the Open||Source||Data podcast, where we talked about the evolution of big data and AI technologies. I’m airing our original conversation in its entirety on this holiday weekend in the U.S.
Malte Pietsch is co-founder & CTO of Deepset, the company behind the popular open source project Haystack, an orchestration framework for LLMs.
In this episode, Paco Nathan and I dive into insights from the inaugural AI Conference in San Francisco (video of talks can be found here).
Semih Salihoglu is an Associate Professor at the University of Waterloo and co-creator of Kuzu, an open source embeddable property graph database management system.
Philipp Moritz (Co-founder and CTO) and Goku Mohandas (ML and Product Lead) of Anyscale do a deep dive into retrieval augmented generation (RAG) and large language models (LLMs).
Bill Marcellino is a senior behavioral scientist at the RAND Corporation, and Nathan Beauchamp-Mustafaga is a policy researcher at the RAND Corporation. They are the principal researchers behind the new report “The Rise of Generative AI and the Coming Era of Social Media Manipulation 3.0”.
Yucheng Low, Cofounder & CEO of XetHub, discusses the challenges of managing large-scale machine learning assets and the need for version control.
Christopher Nguyen is CEO and Co-founder of Aitomatic, a startup that builds virtual advisors tailored with domain-specific expertise, primarily catering to industrial AI applications.
Sudhir Hasbe is Chief Product Officer at Neo4j and a longtime technical and product leader in the data management space.
Yishay Carmiel is the CEO of Meaning, a startup at the forefront of building real-time speech applications for enterprises. We discuss the state of AI for speech and audio, including trends in Generative AI, automatic speech recognition, diarization, restoration, voice cloning, speech synthesis and more.
Casey Ellis is Founder/Chair/CTO of Bugcrowd, a Crowdsourced Cybersecurity Platform. Bugcrowd recently released “Inside the Mind of a Hacker 2023”, an interesting report that provides insights into the motivations, challenges, and specializations of hackers, as well as security implications of AI.
Daniel Lenton is the CEO of Ivy, a suite of tools designed to accelerate AI Model Development and Model Deployment. Ivy serves as a glue that connects various frameworks and compiler infrastructures, making them compatible.
Andrew Burt is the Managing Partner at Luminos.Law, the first law firm focused on helping teams manage the privacy, fairness, security, and transparency of their AI and data — including generative AI systems. We explore the state of risk and compliance in light of generative AI. This episode further explores the challenges and risks posed by AI, and the implications of the FTC probe into OpenAI, as well as the NIST AI Risk Management Framework.
Michele Catasta is VP of AI at Replit, an AI-powered software development platform that allows teams to build and deploy applications on any device, without any setup required.
Subscribe to the Gradient Flow Newsletter: https://gradientflow.substack.com/
Subscribe: Apple • Spotify • Overcast • Google • AntennaPod • Podcast Addict • Amazon • RSS.
Detailed show notes can be found on The Data Exchange web site.
Alex Chao is a Product Manager at Microsoft focused on Semantic Kernel, an open-source AI and LLM orchestrator. Semantic Kernel (SK) is a lightweight SDK that makes it easy to integrate AI models and plugins into applications.
Subscribe to the Gradient Flow Newsletter: https://gradientflow.substack.com/
Subscribe: Apple • Spotify • Overcast • Google • AntennaPod • Podcast Addict • Amazon • RSS.
Detailed show notes can be found on The Data Exchange web site.
Steve Hsu wears many hats, but most recently he is co-founder of SuperFocus, a startup building LLM-backed knowledge co-pilots for enterprises.
Subscribe to the Gradient Flow Newsletter: https://gradientflow.substack.com/
Subscribe: Apple • Spotify • Overcast • Google • AntennaPod • Podcast Addict • Amazon • RSS.
Detailed show notes can be found on The Data Exchange web site.
Brian Raymond is the founder of Unstructured, a startup building open source data pre-processing and ingestion tools specifically for Large Language Models (LLMs).
Subscribe to the Gradient Flow Newsletter: https://gradientflow.substack.com/
Subscribe: Apple • Spotify • Overcast • Google • AntennaPod • Podcast Addict • Amazon • RSS.
Detailed show notes can be found on The Data Exchange web site.
Emil Eifrem is co-founder and CEO of Neo4j, the leading graph database and graph data science software provider. We discussed a range of topics, including the current state of graph databases, graph data science and graph neural networks, vector databases, and the interplay between LLMs, knowledge graphs, and graph databases.
Subscribe to the Gradient Flow Newsletter: https://gradientflow.substack.com/
Subscribe: Apple • Spotify • Stitcher • Google • AntennaPod • Podcast Addict • Amazon • RSS.
Detailed show notes can be found on The Data Exchange web site.
David Talby is the CTO and Founder of John Snow Labs, the company behind two popular open source projects: Spark NLP and LangTest. In this episode we focus on LangTest, an open-source Python library designed to help developers deliver safe and effective Natural Language Processing (NLP) models. [Note: After we recorded this episode, NLTest was renamed to LangTest.]
Subscribe to the Gradient Flow Newsletter: https://gradientflow.substack.com/
Subscribe: Apple • Spotify • Stitcher • Google • AntennaPod • Podcast Addict • Amazon • RSS.
Detailed show notes can be found on The Data Exchange web site.
Jeff Jonas is Founder and CEO of Senzing, a startup focused on democratizing entity resolution – making this deceptively complicated task easy for programmers to use and deploy.
Subscribe to the Gradient Flow Newsletter: https://gradientflow.substack.com/
Subscribe: Apple • Spotify • Stitcher • Google • AntennaPod • Podcast Addict • Amazon • RSS.
Detailed show notes can be found on The Data Exchange web site.
Jerry Liu is CEO and co-founder of LlamaIndex, an open source project and startup that builds tools that enable teams to augment LLMs with their own private data.
Subscribe to the Gradient Flow Newsletter: https://gradientflow.substack.com/
Subscribe: Apple • Spotify • Stitcher • Google • AntennaPod • Podcast Addict • Amazon • RSS.
Detailed show notes can be found on The Data Exchange web site.
Tim Davis is the Co-Founder & Chief Product Officer of Modular, a startup focused on building tools to help simplify AI infrastructure.
Subscribe to the Gradient Flow Newsletter: https://gradientflow.substack.com/
Subscribe: Apple • Spotify • Stitcher • Google • AntennaPod • Podcast Addict • Amazon • RSS.
Detailed show notes can be found on The Data Exchange web site.
Andrew Feldman is CEO and co-founder of Cerebras, a startup that has released the fastest AI accelerator, based on the largest processor. We discussed Cerebras-GPT, a family of language models that have set new benchmarks for accuracy and compute efficiency, with sizes ranging from 111 million to 13 billion parameters.
Subscribe to the Gradient Flow Newsletter: https://gradientflow.substack.com/
Subscribe: Apple • Spotify • Stitcher • Google • AntennaPod • Podcast Addict • Amazon • RSS.
Detailed show notes can be found on The Data Exchange web site.
Louis Brandy is VP of Engineering at Rockset, the real-time search and analytics database startup formed by the creators of the popular open source project, RocksDB.
Subscribe to the Gradient Flow Newsletter: https://gradientflow.substack.com/
Subscribe: Apple • Spotify • Stitcher • Google • AntennaPod • Podcast Addict • Amazon • RSS.
Detailed show notes can be found on The Data Exchange web site.
Amin Ahmad, the co-founder of Vectara, has played a crucial role in developing a powerful API platform specifically tailored for developers.
Subscribe to the Gradient Flow Newsletter: https://gradientflow.substack.com/
Subscribe: Apple • Spotify • Stitcher • Google • AntennaPod • Podcast Addict • Amazon • RSS.
Detailed show notes can be found on The Data Exchange web site.
Jonas Andrulis is the Founder & CEO of Aleph Alpha, a startup that provides enterprise software solutions backed by their own large language models and multimodal models.
Subscribe to the Gradient Flow Newsletter: https://gradientflow.substack.com/
Subscribe: Apple • Spotify • Stitcher • Google • AntennaPod • Podcast Addict • Amazon • RSS.
Detailed show notes can be found on The Data Exchange web site.
Alex Remedios, founder of Treebeardtech, leads a London-based consulting firm dedicated to assisting machine learning teams in constructing dependable, secure, and adaptable cloud infrastructures crucial for delivering business-critical artificial intelligence solutions.
Subscribe to the Gradient Flow Newsletter: https://gradientflow.substack.com/
Subscribe: Apple • Spotify • Stitcher • Google • AntennaPod • Podcast Addict • Amazon • RSS.
Detailed show notes can be found on The Data Exchange web site.
Patrick Hall is co-founder of BNH and a visiting faculty member of decision sciences at the George Washington University School of Business. Agus Sudjianto is EVP, Head of Corporate Model Risk at Wells Fargo. We explore several topics covered in the new book Machine Learning for High-Risk Applications, co-authored by Patrick and with a foreword by Agus.
Subscribe to the Gradient Flow Newsletter: https://gradientflow.substack.com/
Subscribe: Apple • Spotify • Stitcher • Google • AntennaPod • Podcast Addict • Amazon • RSS.
Detailed show notes can be found on The Data Exchange web site.
Omar Maher is Director of Product Marketing at Parallel Domain, a startup that is advancing machine perception capabilities by harnessing the power of synthetic data. We delve into the growing adoption of synthetic data and the factors driving its use. We discuss major developments in synthetic data generation and its overlap with Generative AI. The conversation also covers data privacy, intellectual property, the generation of structured data like LiDAR, the current state of adoption, and key research directions to overcome existing challenges.
Subscribe to the Gradient Flow Newsletter: https://gradientflow.substack.com/
Subscribe: Apple • Spotify • Stitcher • Google • AntennaPod • Podcast Addict • Amazon • RSS.
Detailed show notes can be found on The Data Exchange web site.
Simon Chan is the General Partner at Firsthand Alliance, a venture capital fund focused on the future of B2B and enterprise software. We explore the evolution of AI, cloud computing, and business collaboration tools, revealing how a new generation of generative AI technologies is enabling applications to generate content and drive transformative innovation across various industries.
Subscribe to the Gradient Flow Newsletter: https://gradientflow.substack.com/
Subscribe: Apple • Spotify • Stitcher • Google • AntennaPod • Podcast Addict • Amazon • RSS.
Detailed show notes can be found on The Data Exchange web site.
Gev Sogomonian is co-author of AimStack, an open-source, self-hosted AI metadata tracker that logs all your AI metadata, such as experiments and prompts, and provides a user-friendly UI for comparing and observing them. It also offers an SDK for programmatically querying tracked metadata.
Subscribe to the Gradient Flow Newsletter: https://gradientflow.substack.com/
Subscribe: Apple • Spotify • Stitcher • Google • AntennaPod • Podcast Addict • Amazon • RSS.
Detailed show notes can be found on The Data Exchange web site.
Raymond Perrault is a Distinguished Computer Scientist at SRI International, and Co-Director of the Steering Committee for the AI Index, an annual report that tracks, collates, distills, and visualizes data relating to AI, to help inform decision-makers and teams to take meaningful action for responsible and ethical AI.
Subscribe to the Gradient Flow Newsletter: https://gradientflow.substack.com/
Subscribe: Apple • Spotify • Stitcher • Google • AntennaPod • Podcast Addict • Amazon • RSS.
Detailed show notes can be found on The Data Exchange web site.
Hagay Lupesko is VP Engineering at MosaicML, a startup that enables teams to easily train large AI models on their data and in their own secure environment. We discuss the evolution of cloud-based machine learning (from “traditional” ML through LLMs), his experience building machine learning applications at leading technology companies, and the need for companies to build their own custom foundation models.
Subscribe to the Gradient Flow Newsletter: https://gradientflow.substack.com/
Subscribe: Apple • Spotify • Stitcher • Google • AntennaPod • Podcast Addict • Amazon • RSS.
Detailed show notes can be found on The Data Exchange web site.
Jakub Zavrel is the Founder and CEO at Zeta Alpha, a premier Neural Discovery Platform that utilizes cutting-edge Neural Search technology to enhance the way you and your team uncover, arrange, and disseminate knowledge. Our conversation focuses on the latest developments in artificial intelligence, taking inspiration from their recent viral article featuring the 100 most cited AI papers of 2022.
Subscribe to the Gradient Flow Newsletter: https://gradientflow.substack.com/
Subscribe: Apple • Spotify • Stitcher • Google • AntennaPod • Podcast Addict • Amazon • RSS.
Detailed show notes can be found on The Data Exchange web site.
Chris Wiggins is a Professor at Columbia University and the Chief Data Scientist at the NYTimes. He is also co-author of a fascinating new historical exploration of how data has been used as a tool in shaping society, from the census to eugenics to Google search. How Data Happened traces the trajectory of data and explores new mathematical and computational techniques that serve to shape people, ideas, society, and economies.
Subscribe to the Gradient Flow Newsletter: https://gradientflow.substack.com/
Subscribe: Apple • Spotify • Stitcher • Google • AntennaPod • Podcast Addict • Amazon • RSS.
Detailed show notes can be found on The Data Exchange web site.
Paras Jain and Sarah Wooders are graduate students at UC Berkeley’s Sky Computing Lab. They are part of the team behind Skyplane, an open source project that accelerates wide-area transfers in the cloud via overlay routing and parallelism.
Subscribe to the Gradient Flow Newsletter: https://gradientflow.substack.com/
Subscribe: Apple • Spotify • Stitcher • Google • AntennaPod • Podcast Addict • Amazon • RSS.
Detailed show notes can be found on The Data Exchange web site.
Pablo Villalobos is a Staff Researcher at Epoch, and lead author of the recent paper “Will we run out of data? An analysis of the limits of scaling datasets in Machine Learning”. We discuss the key findings in this paper, as well as a related study Pablo conducted on scaling laws.
Subscribe to the Gradient Flow Newsletter: https://gradientflow.substack.com/
Subscribe: Apple • Spotify • Stitcher • Google • AntennaPod • Podcast Addict • Amazon • RSS.
Detailed show notes can be found on The Data Exchange web site.
Jinsung Yoon (Senior Research Scientist) and Sercan Arik (Staff Research Scientist and Manager) are part of the Google team behind EHR-Safe, a set of tools for generating highly realistic and privacy-preserving synthetic Electronic Health Records.
Subscribe to the Gradient Flow Newsletter: https://gradientflow.substack.com/
Subscribe: Apple • Spotify • Stitcher • Google • AntennaPod • Podcast Addict • Amazon • RSS.
Detailed show notes can be found on The Data Exchange web site.
Brandon Jenkins is Co-founder and COO of Fundrise, the largest direct-to-individuals alternative investment platform in the country. Our conversation centered on their recent foray into technology investing, specifically startup companies in the data infrastructure space.
Subscribe to the Gradient Flow Newsletter: https://gradientflow.substack.com/
Subscribe: Apple • Spotify • Stitcher • Google • AntennaPod • Podcast Addict • Amazon • RSS.
Detailed show notes can be found on The Data Exchange web site.
Zongheng Yang is a researcher in the Sky Computing Lab at UC Berkeley, a multi-year research initiative that utilizes distributed systems, programming languages, security and machine learning to separate the services that a company requires from the choice of a specific cloud. He provides a detailed overview and update on SkyPilot, a groundbreaking intercloud broker that views the cloud ecosystem as a unified and integrated entity rather than a collection of disparate, largely incompatible clouds. SkyPilot enables users to run Machine Learning and Data Science batch jobs on any cloud, realize substantial cost savings, access the best hardware across clouds, and enjoy higher resource availability.
Subscribe to the Gradient Flow Newsletter: https://gradientflow.substack.com/
Subscribe: Apple • Spotify • Stitcher • Google • AntennaPod • Podcast Addict • Amazon • RSS.
Detailed show notes can be found on The Data Exchange web site.
Jesse Anderson, Evan Chan, and I delve into the current developments and possibilities within the realm of data engineering and platforms. As the foundation for artificial intelligence and machine learning, data plays a crucial role in the advancement of these technologies.
Download a copy of the FREE Report: https://gradientflow.com/2023trendsreport/
Subscribe: Apple • Spotify • Stitcher • Google • AntennaPod • Podcast Addict • Amazon • RSS.
Detailed show notes can be found on The Data Exchange web site.
This week we discuss AI regulations with Gabriela Zanfir-Fortuna, VP for Global Privacy at the Future of Privacy Forum, and Andrew Burt, Managing Partner at BNH, the first law firm focused on AI and Analytics.
Subscribe to the Gradient Flow Newsletter: https://gradientflow.substack.com/
Subscribe: Apple • Spotify • Stitcher • Google • AntennaPod • Podcast Addict • Amazon • RSS.
Detailed show notes can be found on The Data Exchange web site.
Dylan Patel is the Chief Analyst at SemiAnalysis, a boutique semiconductor research and consulting firm focused on the semiconductor supply chain from chemical inputs to fabs to design IP and strategy. In this episode, we discuss the emerging open source software stack for PyTorch that makes it easier and more accessible to implement non-Nvidia backends (see his recent post).
Subscribe to the Gradient Flow Newsletter: https://gradientflow.substack.com/
Subscribe: Apple • Spotify • Stitcher • Google • AntennaPod • Podcast Addict • Amazon • RSS.
Detailed show notes can be found on The Data Exchange web site.
Peter Norvig (of Google and Stanford) and Alfred Spector (of MIT) are part of the team of authors behind the must-read book Data Science in Context: Foundations, Challenges, Opportunities. We discussed their recent book and took a deep dive into their Data Science Analysis Rubric, and we also talked about trending topics in AI including looming regulations, synthetic data, and Large Language and Foundation Models.
Subscribe to the Gradient Flow Newsletter: https://gradientflow.substack.com/
Subscribe: Apple • Spotify • Stitcher • Google • AntennaPod • Podcast Addict • Amazon • RSS.
Detailed show notes can be found on The Data Exchange web site.
Percy Liang is Associate Professor of Computer Science and Statistics, and Director of the new Center for Research on Foundation Models at Stanford University. We discussed a new suite of tools (HELM) designed to help users and researchers understand language models in their totality. We also discuss recent trends in AI including the rise of Generative AI and Foundation Models.
Download a copy of our FREE 2023 Trends in Data and AI Report: https://gradientflow.com/2023trendsreport/
Subscribe: Apple • Spotify • Stitcher • Google • AntennaPod • Podcast Addict • Amazon • RSS.
Detailed show notes can be found on The Data Exchange web site.
Jenn Webb, special correspondent and managing editor at Gradient Flow, recently organized a mini-panel to discuss themes and trends for 2023. The panel consisted of myself and Mikio Braun. More information on these trends can be found in our Annual Trends Report, which is available for free download (see details below).
Download a copy of the FREE Report: https://gradientflow.com/2023trendsreport/
Subscribe: Apple • Spotify • Stitcher • Google • AntennaPod • Podcast Addict • Amazon • RSS.
Detailed show notes can be found on The Data Exchange web site.
Given the growing interest in Generative AI, we revisit a conversation with Mark Chen, Research Scientist at OpenAI and part of the team behind DALL·E 2, a new AI system that can create realistic images and art based on natural language descriptions.
Subscribe to the Gradient Flow Newsletter: https://gradientflow.substack.com/
Subscribe: Apple • Spotify • Stitcher • Google • AntennaPod • Podcast Addict • Amazon • RSS.
Detailed show notes can be found on The Data Exchange web site.
On this special end of the year episode, we revisit conversations with two data science leaders in the e-commerce space:
Subscribe to the Gradient Flow Newsletter: https://gradientflow.substack.com/
Subscribe: Apple • Spotify • Stitcher • Google • AntennaPod • Podcast Addict • Amazon • RSS.
Detailed show notes can be found on The Data Exchange web site.
Shayan Mohanty is the CEO of Watchful, a modern and interactive solution that places the control of data labeling back in the hands of data scientists, machine learning practitioners, and subject matter experts. This podcast focuses on a data management system (written in Rust) they built to support the level of automation and interactivity required to support Watchful.
Subscribe to the Gradient Flow Newsletter: https://gradientflow.substack.com/
Subscribe: Apple • Android • Spotify • Stitcher • Google • AntennaPod • RSS.
Detailed show notes can be found on The Data Exchange web site.
Frank Liu is Director of Operations & ML Architect at Zilliz, the company behind Milvus, an open source vector database. We discuss their recent VLDB paper (“A Cloud Native Vector Database Management System”) that describes recent updates to Milvus, as well as vector databases and vector search in general.
Subscribe to the Gradient Flow Newsletter: https://gradientflow.substack.com/
Subscribe: Apple • Android • Spotify • Stitcher • Google • AntennaPod • RSS.
Detailed show notes can be found on The Data Exchange web site.
Ira Cohen is co-founder and Chief Data Scientist at Anodot, a startup that uses time series tools to monitor business data in real time, so organizations can proactively resolve revenue, cost, and customer experience issues before they impact business performance. We recently wrote a well-received post that provided a detailed overview on the state of technologies for collecting, storing, and unlocking time series.
Subscribe to the Gradient Flow Newsletter: https://gradientflow.substack.com/
Subscribe: Apple • Android • Spotify • Stitcher • Google • AntennaPod • RSS.
Detailed show notes can be found on The Data Exchange web site.
Roy Schwartz is Professor of Natural Language Processing at The Hebrew University of Jerusalem. We discussed a recent survey paper that Roy co-wrote that presented a broad overview of existing methods to improve NLP efficiency through the lens of traditional NLP pipelines.
Subscribe to the Gradient Flow Newsletter: https://gradientflow.substack.com/
Subscribe: Apple • Android • Spotify • Stitcher • Google • AntennaPod • RSS.
Detailed show notes can be found on The Data Exchange web site.
On this Thanksgiving holiday weekend in the U.S., we revisit a Twitter Spaces conversation I had with
Subscribe to the Gradient Flow Newsletter: https://gradientflow.substack.com/
Subscribe: Apple • Android • Spotify • Stitcher • Google • AntennaPod • RSS.
Detailed show notes can be found on The Data Exchange web site.
Hung Bui is the CEO of VinAI, a premier Artificial Intelligence research-based company developing world-class products and services. Hung assembled the VinAI team just over three years ago and they are now among the Top 20 Global Companies in AI Research in 2022.
Subscribe to the Gradient Flow Newsletter: https://gradientflow.substack.com/
Subscribe: Apple • Android • Spotify • Stitcher • Google • AntennaPod • RSS.
Detailed show notes can be found on The Data Exchange web site.
Bob van Luijt is CEO of SeMI Technologies, the company behind the popular vector search engine Weaviate. Bob describes their key features and core components, popular use cases, and he also provides an overview of Weaviate’s near-term roadmap. We also discuss how vector search engines compare with existing data management systems.
Subscribe to the Gradient Flow Newsletter: https://gradientflow.substack.com/
Subscribe: Apple • Android • Spotify • Stitcher • Google • AntennaPod • RSS.
Detailed show notes can be found on The Data Exchange web site.
Federico Garza and Max Mergenthaler Canseco are both CTOs and co-founders of Nixtla, a startup building developer-friendly software that helps data scientists deploy predictive pipelines.
Subscribe to the Gradient Flow Newsletter: https://gradientflow.substack.com/
Subscribe: Apple • Spotify • Stitcher • Google • AntennaPod • Podcast Addict • RSS.
Detailed show notes can be found on The Data Exchange web site.
Christopher Nguyen is CEO and cofounder of Aitomatic, a startup that uses a knowledge-first approach to build and deploy machine learning solutions, with a focus on industrial applications (manufacturing and other physical settings).
Join us at K1st World, a fantastic symposium and networking event slated for November 16 & 17. Use the discount code GRADIENTFLOW60 to attend in person or online.
Subscribe to the Gradient Flow Newsletter: https://gradientflow.substack.com/
Subscribe: Apple • Android • Spotify • Stitcher • Google • AntennaPod • RSS.
Detailed show notes can be found on The Data Exchange web site.
Ram Sriharsha is VP of Engineering and R&D at Pinecone, a startup that offers a fully managed vector database (not just an index). We discuss Pinecone’s new proprietary storage engine, which was first described around the time we recorded this conversation.
Subscribe to the Gradient Flow Newsletter: https://gradientflow.substack.com/
Subscribe: Apple • Android • Spotify • Stitcher • Google • AntennaPod • RSS.
Detailed show notes can be found on The Data Exchange web site.
Karthik Ramasamy is the Head of Streaming at Databricks. He has extensive experience in streaming, having led teams at Twitter (Apache Heron), Splunk, and Streamlio (Apache Pulsar).
Subscribe to the Gradient Flow Newsletter: https://gradientflow.substack.com/
Subscribe: Apple • Android • Spotify • Stitcher • Google • AntennaPod • RSS.
Detailed show notes can be found on The Data Exchange web site.
Piotr Żelasko is Head of Research at Meaning, a startup building an AI platform using speech technologies. He has years of experience in speech technologies, both as a researcher and as a software engineer. We recorded this episode during the week of the release of Whisper, a deep learning model (from OpenAI) that approaches human-level robustness and accuracy on English speech recognition. Our conversation centered on Whisper and speech recognition, but also touched on the new speech data processing tools (Lhotse, k2, Icefall) that we described in our recent post.
Download a FREE copy of our recent 2022 Trends Report (Data, Machine Learning, AI): https://gradientflow.com/2022trendsreport/
Subscribe: Apple • Android • Spotify • Stitcher • Google • AntennaPod • RSS.
Detailed show notes can be found on The Data Exchange web site.
Yaron Singer is the CEO of Robust Intelligence, a company building tools to help manage and mitigate risks associated with machine learning models and applications.
Download a FREE copy of our recent 2022 Trends Report (Data, Machine Learning, AI): https://gradientflow.com/2022trendsreport/
Subscribe: Apple • Android • Spotify • Stitcher • Google • AntennaPod • RSS.
Detailed show notes can be found on The Data Exchange web site.
Yashar Behzadi is the CEO & Founder of Synthesis AI, a startup that uses synthetic data technologies to enable teams building AI applications, as well as gaming and metaverse applications.
Download a FREE copy of our recent 2022 Trends Report (Data, Machine Learning, AI): https://gradientflow.com/2022trendsreport/
Subscribe: Apple • Android • Spotify • Stitcher • Google • AntennaPod • RSS.
Detailed show notes can be found on The Data Exchange web site.
Sadegh Riazi is CEO and co-founder of CipherMode Labs, a startup building tools that enable data and machine learning teams to build and deploy models directly on encrypted data. CipherMode’s new open source project enables teams to develop and deploy machine learning algorithms using familiar tools, and thus opens up the possibility of using sensitive data in different scenarios both within an organization, and in cooperation with other organizations.
Download a FREE copy of our recent 2022 Trends Report (Data, Machine Learning, AI): https://gradientflow.com/2022trendsreport/
Subscribe: Apple • Android • Spotify • Stitcher • Google • AntennaPod • RSS.
Detailed show notes can be found on The Data Exchange web site.
John Bohannon is a Senior Director of Data Science and Head of Research at Primer AI, an end-to-end machine intelligence solution for textual data. We discussed their process of translating ML research into ML products, through the lens of several use cases.
Download a FREE copy of our recent NLP Industry Survey Results: https://gradientflow.com/2021nlpsurvey/
Subscribe: Apple • Android • Spotify • Stitcher • Google • AntennaPod • RSS.
Detailed show notes can be found on The Data Exchange web site.
Jon Udell is community lead for Steampipe, an open-source tool that populates a database table with data retrieved from APIs. They use Postgres, which means that data is easy to explore and retrieve using SQL.
Subscribe: Apple • Android • Spotify • Stitcher • Google • AntennaPod • RSS.
Detailed show notes can be found on The Data Exchange web site.
Aadyot Bhatnagar is a Senior Research Engineer at Salesforce and co-creator of Merlion, an open source framework for applying machine learning to time series data. Merlion supports a wide range of time series learning tasks including forecasting, anomaly detection, and change point detection.
Subscribe: Apple • Android • Spotify • Stitcher • Google • AntennaPod • RSS.
Detailed show notes can be found on The Data Exchange web site.
Maarten Grootendorst is a data scientist at IKNL, and more importantly, he’s the author of two open source libraries that I’ve come to love: BERTopic (topic modeling with transformers and c-TF-IDF) and PolyFuzz (fuzzy string matching). Both of these projects bring the power of transformers and other leading-edge models, and package them with simple APIs, clear documentation, and visualization tools.
Download a FREE copy of our recent NLP Industry Survey Results: https://gradientflow.com/2021nlpsurvey/
Subscribe: Apple • Android • Spotify • Stitcher • Google • AntennaPod • RSS.
Detailed show notes can be found on The Data Exchange web site.
Hamza Tahir and Adam Probst are co-creators of ZenML, an extensible open source framework for building reproducible pipelines. We discuss the current state of ZenML, the many use cases that ZenML has been designed for, and its near-term roadmap.
Download the FREE Report: State of Workflow Orchestration → https://gradientflow.com/2022-workflow-orchestration-survey/?utm_source=gradientflow&utm_medium=DEpodcast
Subscribe: Apple • Android • Spotify • Stitcher • Google • AntennaPod • RSS.
Detailed show notes can be found on The Data Exchange web site.
Dr. Omri Allouche is Head of Research at Gong, a company that uses advances in NLP and speech models to identify and highlight risks and opportunities during customer interactions.
Download a FREE copy of our recent NLP Industry Survey Results: https://gradientflow.com/2021nlpsurvey/
Subscribe: Apple • Android • Spotify • Stitcher • Google • AntennaPod • RSS.
Detailed show notes can be found on The Data Exchange web site.
Danny Bickson and Amir Alush are the creators of fastdup, a very impressive free tool for surfacing duplicates, anomalies, and leakage in visual data. In line with its name, it’s fast: fastdup is written in C++ and can handle millions of images easily.
Download a FREE copy of our recent NLP Industry Survey Results: https://gradientflow.com/2021nlpsurvey/
Subscribe: Apple • Android • Spotify • Stitcher • Google • AntennaPod • RSS.
Detailed show notes can be found on The Data Exchange web site.
Mark Chen is a Research Scientist at OpenAI and part of the team behind DALL·E 2, a new AI system that can create realistic images and art based on natural language descriptions.
Download a FREE copy of our recent NLP Industry Survey Results: https://gradientflow.com/2021nlpsurvey/
Subscribe: Apple • Android • Spotify • Stitcher • Google • AntennaPod • RSS.
Detailed show notes can be found on The Data Exchange web site.
Jules Damji is lead developer advocate, and Richard Liaw is an engineering manager at Anyscale, the startup founded by the creators of Ray, the open source project that makes it simple to scale any compute-intensive Python workload.
To learn more about Ray and how to scale machine learning applications, attend the Ray Summit (San Francisco / Aug 23-24) https://www.anyscale.com/ray-summit-2022?utm_source=gradientflow&utm_medium=DEpodcast
Subscribe: Apple • Android • Spotify • Stitcher • Google • AntennaPod • RSS.
Detailed show notes can be found on The Data Exchange web site.
Rick Lamers is co-Founder and CEO at Orchest, the startup behind an open source project that enables data scientists to create, manage, and execute complex end-to-end data pipelines.
Download the FREE Report: State of Workflow Orchestration → https://gradientflow.com/2022-workflow-orchestration-survey/?utm_source=gradientflow&utm_medium=DEpodcast
Subscribe: Apple • Android • Spotify • Stitcher • Google • AntennaPod • RSS.
Detailed show notes can be found on The Data Exchange web site.
Devin Petersohn is CTO and co-founder of Ponder, and the creator of Modin, a fast, scalable, drop-in replacement for the popular Pandas library.
Download the FREE Report: State of Workflow Orchestration → https://gradientflow.com/2022-workflow-orchestration-survey/?utm_source=gradientflow&utm_medium=DEpodcast
Subscribe: Apple • Android • Spotify • Stitcher • Google • AntennaPod • RSS.
Detailed show notes can be found on The Data Exchange web site.
Nick Schrock is the founder of Elementl, the startup behind Dagster, a popular open source data orchestration platform. We discussed recent trends in data engineering and infrastructure, and Dagster’s introduction of software-defined assets, a new approach to managing, maintaining, and orchestrating data declaratively.
Download the FREE Report: State of Workflow Orchestration → https://gradientflow.com/2022-workflow-orchestration-survey/?utm_source=gradientflow&utm_medium=DEpodcast
Edmon Begoli leads the AI Systems R&D section at Oak Ridge National Laboratory (ORNL), where he is also a distinguished member of the ORNL research staff. Our conversation centered on his upcoming presentation at the Data+AI Summit, where he will describe the four principal categories of Adversarial AI and their future implications.
Download the FREE Report: Trends in Data, Machine Learning, and AI → https://gradientflow.com/2022trendsreport?utm_source=DEpodcast
Haytham Abuelfutuh is co-founder and CTO of Union, a startup founded by the team behind Flyte, a popular open source project originated by Lyft. Flyte is a workflow automation platform used for many different applications, but especially as an orchestrator for machine learning applications.
Download the FREE Report: State of Workflow Orchestration → https://www.prefect.io/lp/gradientflow?utm_source=gradientflow&utm_medium=newsletter
This week’s guest is Hilary Mason, co-founder of Hidden Door, a startup that uses AI and machine learning to help create and power role-playing games (RPG).
Download a FREE copy of our recent NLP Industry Survey Results: https://gradientflow.com/2021nlpsurvey/
Oren Razon is CEO and co-founder of Superwise, a startup that builds tools to streamline observability for machine learning models. This episode provides a comprehensive overview of tools and best practices for deploying, monitoring, and managing machine learning models in production.
Download the FREE Report: Trends in Data, Machine Learning, and AI → https://gradientflow.com/2022trendsreport?utm_source=DEpodcast
Jeremiah Lowin is co-founder and CEO of Prefect, the company behind the popular open source data workflow orchestration system with the same name. We discussed the major design changes in Prefect 2.0, their move towards treating “code as workflows”, data engineering challenges facing data and ML teams today, and implications of looming trends in machine learning and AI.
Download the FREE Report: State of Workflow Orchestration → https://www.prefect.io/lp/gradientflow?utm_source=gradientflow&utm_medium=newsletter
Sebastian Raschka is lead author of a new book from Packt entitled “Machine Learning with PyTorch and Scikit-Learn”. He is also an Assistant Professor of Statistics at the University of Wisconsin (Madison), and serves as the Lead AI Educator at Grid.ai.
Download a FREE copy of our recent NLP Industry Survey Results: https://gradientflow.com/2021nlpsurvey/
This week’s guests are Ade Fajemisin (Postdoctoral Researcher) and Donato Maragno (PhD Student) of the University of Amsterdam. They were co-authors of a recent paper (“Optimization with Constraint Learning: A Framework and Survey”) that explores how machine learning can be used to learn constraints in optimization problems.
Download the FREE Report: Trends in Data, Machine Learning, and AI → https://gradientflow.com/2022trendsreport?utm_source=DEpodcast
This week’s guests are Barret Zoph and Liam Fedus, research scientists at Google Brain. Our conversation centered around Large Language Models (LLM), specifically recent work by Barret, Liam, and their collaborators on efficient scaling of large language models.
Download a FREE copy of our recent NLP Industry Survey Results: https://gradientflow.com/2021nlpsurvey/
Olivia Liao is Senior Director of Data Science at Stitch Fix, a company that uses data science and expert stylists to deliver personalization at scale. We discuss how they blend data science and domain expertise, how they tune recommendations in light of logistics and supply chain constraints, and how they incorporate new developments in large language models, multimodal models and Responsible AI.
Download a FREE copy of our recent NLP Industry Survey Results: https://gradientflow.com/2021nlpsurvey/
Jack Clark is co-director of the AI Index Steering Committee. In this episode we discuss key findings of the fifth edition of the AI Index. The report uses multiple metrics (benchmarks, publications, patents, legislation, etc.) to track progress in AI (mainly deep learning) in key areas that include computer vision, speech recognition, and language models.
Download the FREE Report: Trends in Data, Machine Learning, and AI → https://gradientflow.com/2022trendsreport?utm_source=DEpodcast
This week’s guests are Ajay Kulkarni (CEO) and Mike Freedman (CTO), co-founders of Timescale, the startup behind the popular relational database for time-series and analytics. Mike is also a Professor of Computer Science at Princeton University. Our conversation took place a few weeks after Timescale raised a massive funding round and achieved unicorn status.
Download the FREE Report: 2022 Data Engineering Survey Report → https://gradientflow.com/2022desurvey/?utm_source=DEpodcast
This week’s guest is Wendy Foster, Director of Engineering & Data Science at Shopify. We discussed applications of data science within Shopify, how they organize their data teams, the lifecycle of a data science project within the company, and how they approach emerging challenges like Responsible AI, large language models, and multimodal models.
Download the FREE Report: Trends in Data, Machine Learning, and AI → https://gradientflow.com/2022trendsreport?utm_source=DEpodcast
This week’s guests are Elham Tabassi of the National Institute of Standards and Technology (NIST) and Andrew Burt, Managing Partner of BNH.ai, the first law firm focused on AI compliance, risk mitigation, and related topics. We discuss the new NIST framework – “AI Risk Management Framework” – intended for voluntary use to manage risks in the design, development and use of AI products and systems.
Download the FREE Report: Trends in Data, Machine Learning, and AI → https://gradientflow.com/2022trendsreport?utm_source=DEpodcast
This week’s guests are Amit Sharma (Principal Researcher) and Emre Kiciman (Senior Principal Researcher) of Microsoft Research. We talk about practical applications of causal inference, a set of tools and techniques that enable data teams to draw causal conclusions based on data. Amit and Emre are part of the team behind DoWhy, a new open source library for estimating causal effects based on historical data alone, particularly useful when we cannot run an experiment because of time, expense, or ethical concerns.
Download the FREE Report: Trends in Data, Machine Learning, and AI → https://gradientflow.com/2022trendsreport?utm_source=DEpodcast
Leo Meyerovich is founder and CEO of Graphistry, a startup building tools to democratize visual graph intelligence and graph machine learning. Leo and I recently wrote a well-received post (“What Is Graph Intelligence?”) making the case for why companies need to revisit graph analytics and graph intelligence.
Download the FREE Report: Trends in Data, Machine Learning, and AI → https://gradientflow.com/2022trendsreport?utm_source=DEpodcast
This week’s guests are Dia Trambitas-Miron (Head of Product) and David Talby (CTO) of John Snow Labs, the startup behind the popular open source project, Spark NLP. The company also has a suite of products including an NLP platform targeted specifically for the healthcare, pharmaceutical, and biotech sectors.
Download a FREE copy of our recent NLP Industry Survey Results: https://gradientflow.com/2021nlpsurvey/
Simon Crosby is CTO of Swim.ai, a startup building tools (based on the Swim open source project) for next-generation data and AI applications. Swim is one of several projects (along with Ray and Akka) contributing to interest in the Actor Model for building large-scale machine learning and data applications and infrastructure.
Download the FREE Report: Trends in Data, Machine Learning, and AI → https://gradientflow.com/2022trendsreport?utm_source=DEpodcast
Nicholas Boucher is a PhD student at Cambridge University, where his focus is on security, including topics like homomorphic encryption, voting systems, and adversarial machine learning. He is the lead author of a fascinating new paper – “Bad Characters: Imperceptible NLP Attacks” – which provides a taxonomy of attacks against text-based NLP models that are based on Unicode and other encoding systems.
Download a FREE copy of our recent NLP Industry Survey Results: https://gradientflow.com/2021nlpsurvey/
This week’s guest is Anjali Samani, Director of Data Science and Data Intelligence at Salesforce. We first met during the early days of Faculty, one of the leading data science and AI startups in Europe. Anjali helped design and lead the early Fellowship programs at Faculty (these are intensive bootcamps that turn STEM PhDs into industrial data scientists).
Download the FREE Report: Trends in Data, Machine Learning, and AI → https://gradientflow.com/2022trendsreport?utm_source=DEpodcast
Savin Goyal is CTO and co-founder of Outerbounds, a startup building infrastructure to help teams streamline how they build machine learning applications. Prior to starting Outerbounds, Savin and team worked at Netflix, where they were instrumental in the creation and release of Metaflow, an open source Python framework that addresses some of the challenges data scientists face around scalability and version control.
Download the FREE Report: Trends in Data, Machine Learning, and AI → https://gradientflow.com/2022trendsreport?utm_source=DEpodcast
Moshe Wasserblat is a Senior Principal Engineer at Intel, where he serves as a Research Manager focused on NLP and Deep Learning.
Download a FREE copy of our recent NLP Industry Survey Results: https://gradientflow.com/2021nlpsurvey/
Gaurav Chakravorty is a Senior Manager at Discord, where he leads the team responsible for machine learning models in the area of search and notification. Prior to Discord, Gaurav was a manager at Google, where he led the team responsible for personalized podcast recommendations.
Download the FREE Report: Trends in Data, Machine Learning, and AI → https://gradientflow.com/2022trendsreport?utm_source=DEpodcast
This week's guest is Mike Tung, founder and CEO of Diffbot, a startup that crawls the web and offers one of the most comprehensive knowledge graphs accessible through a variety of simple interfaces.
Download the FREE Report: Trends in Data, Machine Learning, and AI → https://gradientflow.com/2022trendsreport?utm_source=DEpodcast
In this episode of the Data Exchange, our special correspondent and managing editor Jenn Webb organized a mini-panel composed of myself and my podcast co-organizer Mikio Braun. This conversation took place as we were assembling our list of trends for 2022.
Download the FREE Report: Trends in Data, Machine Learning, and AI → https://gradientflow.com/2022trendsreport?utm_source=DEpodcast
This episode features conversations with two experts who have helped train and release models that can recognize, predict, and generate human language on the basis of very large text-based data sets. First is an excerpt of my conversation with Connor Leahy, AI Researcher at Aleph Alpha GmbH, and founding member of EleutherAI (pronounced “ee-luther”), a collective of researchers and engineers building resources and models for researchers who work on natural language models. Next up is an excerpt from a recent conversation with Yoav Shoham, co-founder of AI21 Labs, creators of the largest language model available to developers.
Download a FREE copy of our recent NLP Industry Survey Results: https://gradientflow.com/2021nlpsurvey/
Azeem Ahmed is Director of Engineering at Shopify, where he leads the team that builds the primitives and the APIs used by all data scientists, machine learning engineers, and members of Shopify's engineering team. Our conversation focused on the evolution and design of data and machine learning platforms within Shopify. Azeem and I also discussed broader trends, including the rise of modern data platforms and the maturation of data lakehouses.
Download a FREE copy of our recent NLP Industry Survey Results: https://gradientflow.com/2021nlpsurvey/
Christopher Nguyen is CEO and co-founder of Aitomatic, a startup building a platform for Industrial AI applications. Christopher previously held executive and leadership roles at organizations tasked with building machine learning solutions for traditional enterprises. Our conversation centered around what Christopher terms AI Engineering – a new discipline concerned with the qualitative and quantitative design, construction, and operation of systems with artificial-intelligence capabilities.
Download a FREE copy of our recent Data Engineering Survey Results: https://gradientflow.com/2022desurvey
This week’s guest is Anshul Pandey, CTO and co-founder at Accern, a startup helping financial services companies build and deploy AI applications via a no-code platform. Our conversation focused on the specific challenges of building AI and NLP applications within financial services.
Download a FREE copy of our recent NLP Industry Survey Results: https://gradientflow.com/2021nlpsurvey/
Che Sharma is the founder and CEO of Eppo, an experimentation framework that integrates with modern data platforms (cloud lakehouses and cloud data warehouses). We discuss the importance of investing in experimentation tools and the power of having a well-oiled experimentation culture within an organization. Che also explains how modern data platforms enable applications such as experimentation frameworks like Eppo.
Download a FREE copy of our recent Data Engineering Survey Results: https://gradientflow.com/2022desurvey
Happy Thanksgiving to listeners who celebrate it! This episode features conversations with two experts who have been applying reinforcement learning to problems in industry. First is an excerpt of my conversation with Nicolas (Nic) Hohn, Chief Data Scientist, McKinsey/QuantumBlack Australia. Nic led a team of data scientists charged with helping America’s Cup winning team, Emirates Team New Zealand, test new designs for hydrofoils – important sailing boat components that could be modified based on rules set forth by race organizers. I also include an excerpt of a conversation with Max Pumperla, Data Science Professor at IU International University of Applied Sciences, who at the time of our conversation, was also the Head of Product Research at Pathmind, a SaaS that helps businesses use reinforcement learning in real-world applications.
This week’s guest is Nikhil Muralidhar, a Graduate Research Assistant at Virginia Tech College of Engineering. He is the lead author of an excellent survey paper entitled “Using AntiPatterns to avoid MLOps Mistakes”.
Download a FREE copy of our recent Data Engineering Survey Results: https://gradientflow.com/2022desurvey
Pardhu Gunnam (CEO) and Mars Lan (CTO) are co-founders of Metaphor Data, creators of the first Modern Metadata Platform. As we noted in a previous post, a metadata fabric is the right foundation for data governance and data discovery solutions, data catalogs, and other enterprise data services. This insight resulted in several metadata systems being created within technology companies a few years ago. In fact, the team at Metaphor created one of the more popular systems – DataHub – while they were at LinkedIn.
Video version has a detailed table of contents: https://www.youtube.com/watch?v=W8ZJHN77Ieg
This week’s guest is Yoav Shoham, co-founder of AI21 Labs, creators of the largest language model available to developers. Yoav is also a Professor Emeritus of Computer Science at Stanford University, and a serial entrepreneur who has co-founded numerous data and AI startups.
This week’s guest is Jeremy Stanley, co-founder and CTO of Anomalo, a startup building SaaS tools to help companies with data quality. Prior to Anomalo, Jeremy was VP of Data Science at Instacart.
This week’s guest is Michel Tricot, co-founder and CEO of Airbyte, a startup behind the popular open source project with the same name. While still a relatively young open source project, Airbyte has emerged as a favorite among data and platform engineers tasked with building and maintaining data integration systems within companies.
This week’s guest is Hamel Husain, Staff Machine Learning Engineer at GitHub and a core developer for fastai. Prior to GitHub, Hamel worked on machine learning applications and systems at Airbnb and DataRobot.
This week’s guest is Bob Friday, VP and CTO at Mist Systems, a Juniper company. Bob is a serial entrepreneur and seasoned technologist, and at Mist his team uses data technologies, machine learning, and AI to “optimize user experiences and simplify operations across the wireless access, wired access, and SD-WAN domains”.
Bob and his team build models from structured, semi-structured, and unstructured data. They have deployed anomaly detection models that rely on deep learning (LSTMs) and have begun exploring the use of graph neural networks for a variety of use cases. They have also built and deployed systems that use recent advances in natural language models. Their virtual assistant provides insight and guidance to IT staff via a natural language conversational interface.
This week’s guest is Dr. Viviana Acquaviva, Associate Professor in the Physics Department at the CUNY NYC College of Technology and at the CUNY Graduate Center. She is an Astrophysicist with a strong interest in Data Science and Machine Learning.
This week I have my annual check-in on the state of Julia with Viral Shah, Co-founder and CEO of Julia Computing. Since we spoke last year, Julia continues to make inroads and grow its user base, and Julia Computing closed their $24M Series A round in July.
This week our special correspondent and editor Jenn Webb and I speak with Jike Chong and Cathy Chang, executives and seasoned leaders of data science teams. Our conversation is focused on their new book “How to Lead in Data Science”.
In this episode of the Data Exchange, our special correspondent and editor Jenn Webb organized a mini-panel composed of myself and Paco Nathan, author, teacher, and founder of Derwen.ai, a boutique consulting firm specializing in Data, machine learning, and AI. Of late, Paco has been doing a lot of work with graphs and as such he’s had to immerse himself in the world of graph data management technologies. This conversation is focused on what’s new with graph databases, and why there’s been a resurgence in interest in them. We also discuss use cases of graph databases, graph analytics, and graph neural networks.
This week our special correspondent and editor Jenn Webb speaks with Tara Kelly, Data Editor at DataJournalism.com (DJC), an organization created by the European Journalism Centre. DJC provides journalists and media groups with free resources, materials, online video courses and community forums. Most recently they created two free e-books: The Verification Handbook and an updated edition of the Data Journalism Handbook.
This week’s guests are Rayid Ghani, Distinguished Career Professor in the Machine Learning Department and the Heinz College of Information Systems and Public Policy at Carnegie Mellon University, and Andrew Burt, co-founder and Managing Partner of BNH.ai, a new law firm focused on AI compliance, risk mitigation, and related topics. BNH is the first law firm run by lawyers and technologists focused on helping companies identify and mitigate risks associated with machine learning and AI.
This week’s guest is Charles Martin, independent researcher and founder of Calculation Consulting, a boutique consultancy focused on data science and machine learning. Along with Michael Mahoney and Serena Peng, Charles is co-author of a recent Nature paper on new methods for evaluating and tuning deep learning models (“Predicting trends in the quality of state-of-the-art neural networks without access to training or testing data”).
This week our special correspondent and editor Jenn Webb organized a mini-panel composed of myself and Jesse Anderson, Managing Director at the Big Data Institute. Jesse is the author of a recent book entitled “Data Teams: A Unified Management Model for Successful Data-Focused Teams”. This conversation was focused on key areas in data engineering.
This week our managing editor Jenn Webb and I speak with Sean Taylor, Data Science Manager at Lyft. Sean was previously a research scientist and manager at Facebook where he was instrumental in the creation and release of Prophet, a very popular open source library for time-series forecasting.
This week’s guests are Steven Feng, Graduate Student and Ed Hovy, Research Professor, both from the Language Technologies Institute of Carnegie Mellon University. We discussed their recent survey paper on Data Augmentation Approaches in NLP (GitHub), an active field of research on techniques for increasing the diversity of training examples without explicitly collecting new data. One key reason why such strategies are important is that augmented data can act as a regularizer to reduce overfitting when training models.
This week’s guest is Brad King, CTO of Scality, a company that builds software-defined file and object storage systems for hybrid & multi-cloud settings. Storage and compute are the basic building blocks of (cloud) computing platforms and this episode highlights all the important considerations and recent innovations in storage technologies that data engineers, architects, and machine learning professionals need to know.
In this episode, our managing editor Jenn Webb and I speak with Chris White, CTO of Prefect, a startup building tools to help companies build, monitor, and manage dataflows. Prefect originated from lessons Chris and his co-founder learned while they were at Capital One, where they were early users and contributors to related projects like Apache Airflow.
This week’s guests are Reza Hosseini, Staff Software Engineer, and Albert Chen, Staff Data Scientist, both at LinkedIn. Reza and Albert are part of the team behind the new open source library Greykite, a flexible and fast library for time-series forecasting.
This week’s guest is Sercan Arik, Research Scientist at Google Cloud AI. Sercan and his collaborators recently published a paper on TabNet, a deep neural network architecture for tabular data. It uses sequential attention to select features, is explainable, and based on tests Sercan and team have done spanning many domains, TabNet outperforms or is on par with other models (e.g., XGBoost) on classification and regression problems.
This week’s guest is Connor Leahy, AI Researcher at Aleph Alpha GmbH, and founding member of EleutherAI (pronounced “ee-luther”), a collective of researchers and engineers building resources and models for researchers who work on natural language models. As NLP research becomes more computationally demanding and data intensive, there is a need for researchers to work together to develop tools and resources for the broader community. While relatively new, EleutherAI has already released models and data that many researchers are benefitting from.
This week’s guests are leading researchers in recommendation systems: Paolo Cremonesi is Professor of Computer Science and Maurizio Ferrari Dacrema is a Postdoc at Politecnico di Milano, where they are both part of the RecSys research group. Paolo is also the Reproducibility co-chair for the upcoming RecSys Conference.
This week’s guest is Hyun Kim, co-founder and CEO of Superb AI, a startup building tools to help companies manage data across the entire machine learning application lifecycle. This includes tools to label, store, and monitor data assets that power all computer vision applications. We also discussed emerging trends in machine learning and AI including synthetic data, reinforcement learning, and self-supervised learning.
This week’s guest is Nicolas (Nic) Hohn, Chief Data Scientist, McKinsey/QuantumBlack Australia. Nic led a team of data scientists charged with helping America’s Cup winning team, Emirates Team New Zealand, test new designs for hydrofoils – important sailing boat components that could be modified based on rules set forth by race organizers. More precisely, the QuantumBlack team used Ray RLlib to design an AI agent that could learn to sail the boat for a given design at an optimal speed, and this AI agent proved crucial during the design process.
In this episode of the Data Exchange I speak with Andrew Burt, co-founder and Managing Partner of BNH.ai, a new law firm focused on AI compliance, risk mitigation, and related topics. BNH is the first law firm run by lawyers and technologists focused on helping companies identify and mitigate risks associated with machine learning and AI.
This week’s guest is Travis Addair, who previously led the team at Uber that was responsible for building Uber’s deep learning infrastructure. Travis is deeply involved with two popular open source projects related to deep learning:
In this episode of the Data Exchange, I speak with Yonatan Geifman, CEO and co-founder of Deci, as well as with Ran El-Yaniv, Chief Scientist and co-founder of Deci and Professor of Computer Science at Technion.
In this episode of the Data Exchange, our special correspondent and managing editor Jenn Webb organized a mini-panel composed of myself and Jerry Overton, who previously served as a DXC Fellow, Head of AI at DXC Technology. We discussed Jerry’s experience helping companies across many industries adopt data science and machine learning. We spoke about Centers of Excellence for AI, automation in the workforce, human-centered and responsible AI, and cyborgs!
As the amount and importance of data grows within organizations, there is growing interest in tools that enable them to strategically utilize, manage, and unlock their data resources. This week’s guest is Steven (Steve) Touw, cofounder and CTO of Immuta, a startup that builds tools that help companies address data governance, data discovery, data privacy and security.
In this episode of the Data Exchange, I speak with Davit Buniatyan, founder and CEO of ActiveLoop, a startup building data management tools for unstructured data types commonly associated with deep learning.
In this episode of the Data Exchange, I speak with Zhe Zhang, Engineering Manager at Anyscale, where he leads the team that works on Ray and its ecosystem of libraries and partners. Ray is an open source, general purpose framework for building distributed applications (more details in this post and video).
In this episode of the Data Exchange, I speak with Abe Gong, CEO and co-founder at Superconductive, a startup founded by the team behind the Great Expectations (GE) open source project. GE is one of a growing number of tools aimed at improving data quality through tools for validation and testing. Other projects in this area include TensorFlow DV, assertr, dataframe-rules-engine, deequ, data-describe, and Apache Griffin.
In this episode of the Data Exchange, I speak with Parisa Rashidi, Associate Professor at the Department of Biomedical Engineering at University of Florida. Parisa is a computer scientist and machine learning researcher who specializes in applications of ML to healthcare and biomedical domains.
In this episode of the Data Exchange, our special correspondent and managing editor Jenn Webb organized a mini-panel composed of myself and Simon Rodriguez, Data Research Assistant at the Center for Security and Emerging Technology (CSET) at Georgetown University. Through a series of reports and data briefs, CSET provides policymakers with data-rich material to inform and guide public policy.
In this episode of the Data Exchange, I speak with Ryan Wisnesky, CTO and co-founder of Conexus, a startup that uses techniques from mathematics and incorporates them into novel tools for data integration, data management, and knowledge management.
In this episode of the Data Exchange, I speak with Jian Pei, Professor, School of Computing Science, Simon Fraser University. His research spans data science, big data, data mining, and database systems. But in this podcast we talk about tools for estimating the economic value of data.
In this episode of the Data Exchange, our special correspondent and managing editor Jenn Webb and I speak with Sharon Zhou, a PhD student in Computer Science at Stanford University. Sharon has been teaching very popular courses on GANs (generative adversarial networks) on Coursera. In this conversation we examine the state of Education Technology (EdTech), learning platforms, and other tools for teaching online. A year into the global pandemic, we discuss advantages and disadvantages of various technologies for delivering classes, as well as broader issues in education.
We also took the opportunity to discuss Sharon’s work on deep learning, including her work using GANs to help the general public and policy makers to better understand the implications of climate change.
In this episode of the Data Exchange I speak with Sheldon Fernandez, CEO at DarwinAI, and Alex Wong, Professor at the University of Waterloo and co-founder of DarwinAI (where he is Chief Scientist) and Euclid Labs.
In this episode of the Data Exchange, our special correspondent and managing editor Jenn Webb organized a mini-panel composed of myself and Assaf Araki, investment manager at Intel Capital. Assaf and I have written a series of articles and this interview took place shortly before the release of our most recent collaboration: The Growing Importance of Metadata Management Systems. We devote this episode to how metadata management will impact many enterprise data systems.
In this episode of the Data Exchange I speak with Michael Mahoney, a researcher at UC Berkeley’s RISELab, ICSI, and Department of Statistics. Mike and his collaborators were recently awarded one of the best paper awards at NeurIPS 2020, one of the leading research conferences in machine learning.
Download the 2021 Trends Report: Data, Machine Learning, AI and learn about emerging technologies for data management, data engineering, machine learning, and AI.
In this episode of the Data Exchange, our special correspondent and managing editor Jenn Webb organized a mini-panel composed of myself and Sonal Goyal, founder of Aficx, a startup that builds solutions to unify data silos for cross selling and upselling, fraud and risk management, compliance and regulatory reporting.
In this episode of the Data Exchange I speak with Bruno Fernandez-Ruiz, CTO and cofounder of Nexar, Inc., a startup that uses dash cams powered by vision-based applications to improve driving and logistics.
In this episode of the Data Exchange I speak with Bharath (“Bart”) Ramsundar, author and open source developer. While in graduate school, Bart created deepchem, an open source project that aims to democratize deep learning for science.
Download the 2020 NLP Survey Report and learn how companies are using and implementing natural language technologies.
In this episode of the Data Exchange I speak with Ira Cohen, co-founder and Chief Data Scientist at Anodot, a startup that uses AI for business monitoring.
In this episode of the Data Exchange I speak with Omer Dror, CEO and co-founder of Lynx.md, a startup that enables data exchanges and markets in the health and life sciences. Data exchanges match data providers and suppliers with data buyers and users.
In this episode of the Data Exchange, our special correspondent and editor Jenn Webb organized a mini-panel composed of myself and my podcast co-organizer Mikio Braun. We began our conversation by taking a look back at some of our predictions from last year which included applications of reinforcement learning, end-to-end machine learning platforms, and more. This year we organized trends in the following categories:
This episode provides a sneak peek at a formal report that comes out in early 2021. Sign up here and we will send you a copy of our 2021 Trends Report as soon as it comes out.
In this episode of the Data Exchange, our special correspondent and editor Jenn Webb organized a mini-panel composed of myself and Jesse Anderson, Managing Director at the Big Data Institute. Jesse is the author of a recent book entitled “Data Teams: A Unified Management Model for Successful Data-Focused Teams”.
In this episode of the Data Exchange I speak with Dan Geer, Senior Fellow at In-Q-Tel, and Andrew Burt, co-founder and Managing Partner of BNH.ai and Chief Legal Officer at Immuta. Dan is one of the leading experts in cybersecurity and risk management, and he has written numerous influential essays on security, privacy, and risk (examples here and here). Andrew serves as co-founder of a new law firm focused on AI compliance and related topics. BNH is the first law firm run by lawyers and technologists focused on helping companies identify and mitigate risks associated with machine learning and AI.
In this episode of the Data Exchange I speak with Dr. Rumman Chowdhury, founder of Parity, a startup building products and services to help companies build and deploy ethical and responsible AI. Prior to starting Parity, Rumman was Global Lead for Responsible AI at Accenture Applied Intelligence.
In this episode of the Data Exchange I speak with Jack Morris, a member of Google’s AI Residency program. He is co-creator of TextAttack, an open source framework for adversarial attacks, data augmentation, and adversarial training in NLP (paper, code).
In this episode of the Data Exchange I speak with Yishay Carmiel, an AI Leader at Avaya, a company focused on digital communications. He has long been immersed in speech technologies and conversational applications and I have frequently used him as a resource to understand the latest in speech systems. We previously co-wrote an article that listed out recommendations for teams building speech applications. We also had a previous conversation on the impact of deep learning and big data on speech technologies.
In this episode of the Data Exchange I speak with Ram Shankar, a Berkman Klein Center affiliate, and a researcher and engineer who works at the intersection of Machine Learning and Security. This episode is focused on the current state of tools and techniques for securing machine learning applications.
In this episode of the Data Exchange I speak with Marco Ribeiro, Senior Researcher at Microsoft Research, and lead author of the award-winning paper “Beyond Accuracy: Behavioral Testing of NLP models with CheckList”. As machine learning gains importance across many application domains and industries, there is a growing need to formalize how ML models get built, deployed, and used. MLOps is an emerging set of practices focused on productionizing the machine learning lifecycle that draws ideas from CI/CD. But even before we talk about deploying a model to production, how do we inject more rigor into the model development process?
In this episode of the Data Exchange I speak with Xinyi Zhou, a graduate student in Computer and Information Science at Syracuse University. Xinyi and her advisor (Reza Zafarani) recently wrote a comprehensive survey paper entitled “A Survey of Fake News: Fundamental Theories, Detection Methods, and Opportunities”. They set out to organize the many different methods and perspectives used to detect fake news. Their paper is a great resource for anyone wanting to understand the strengths and limitations of various state-of-the-art techniques, and to get a feel for where the research community might be headed in the near future.
In this episode of the Data Exchange I speak with Neil Thompson, Research Scientist at the Computer Science and Artificial Intelligence Lab (CSAIL) and the Initiative on the Digital Economy, both at MIT. I wanted Neil on the podcast to discuss a recent paper he co-wrote entitled “The Computational Limits of Deep Learning” (summary version here). This paper provides estimates of the amount of computation, economic costs, and environmental impact that come with increasingly large and more accurate deep learning models.
In this episode of the Data Exchange I speak with Piero Molino, creator of Ludwig, a toolbox that allows users to train and test deep learning models through a declarative interface. Piero created Ludwig while serving as a Senior Research Scientist at Uber AI. He originally created Ludwig for his personal use and it slowly garnered users within Uber. When it was open sourced in early 2019, the project immediately found a receptive audience at the conferences I was chairing at the time.
In this episode of the Data Exchange I speak with Mayank Kejriwal, a Research Assistant Professor in the Department of Industrial and Systems Engineering, and a Research Lead at the USC Information Sciences Institute. The focus of our conversation is knowledge graphs, a collection of linked entities (objects, events, concepts) that is used in many AI applications. For example, Google uses a knowledge graph to enhance its search engine results with infoboxes that appear in some search results. Other areas where knowledge graphs are common include e-commerce, healthcare, and financial services.
In this episode of the Data Exchange I speak with Murat Özbayoğlu, Chair of Artificial Intelligence Engineering at TOBB University of Economics and Technology in Ankara, Turkey. I wanted Murat on to discuss two survey papers he and his colleagues wrote on the use of deep learning in finance.
I’ve long been fascinated with finance and trading. My first job after I left academia was as the lead quant in a hedge fund, and ever since, I’ve tried to stay abreast of what tools and techniques quants and data scientists in finance are using. Forecasting in this setting usually means price prediction or price movement (trend) prediction. The outputs of forecasting models are used to inform investment decisions. What makes finance particularly challenging is that many people are using the same underlying data (time series of prices/values), and thus, as Murat notes, many firms use alternative data sources (such as text) as potential sources of additional signal.
In this episode of the Data Exchange I speak with Viral Shah, co-founder and CEO of Julia Computing. Along with his Julia language co-creators, Viral was awarded the 2019 Wilkinson Prize for outstanding contributions in the field of numerical software. I first tweeted about Julia at the beginning of March 2012 after seeing Jeff Bezanson give a talk at Stanford. I’ve dabbled with it here and there, but have never used it for a major project. Over the past few years, Julia has continued to add packages at a steady pace, and the package manager is really quite impressive and solid. We spent much of the podcast discussing the state of Julia, Julia 1.5, and the Julia ecosystem and community.
In this episode of the Data Exchange I speak with Kira Radinsky, Chairwoman & Chief Technology Officer at Diagnostic Robotics, a startup using AI to build a medical-grade triage and clinical-predictions platform. She is also a visiting Professor at Technion – Israel Institute of Technology. Kira has extensive experience using data science and machine learning in a variety of settings, and she was one of the pioneers in using alternative data sources to augment forecasting models. Her earlier work includes models to predict social unrest as well as disease outbreaks. The global pandemic has increased the need for experts in medical data mining, a field to which Kira has made many significant contributions.
In this episode of the Data Exchange I speak with Max Pumperla, deep learning engineer at Pathmind and a contributor to many open source projects in data science and machine learning. Max is speaking on applications of reinforcement learning to simulation problems at the upcoming Ray Summit, a free virtual conference scheduled for Sep 30th and Oct 1st. Earlier this year I had Pathmind’s CEO Chris Nicholson on this podcast and he described how reinforcement learning might play a role in simulation problems. In this episode, Max provides an update and a technical description of how Pathmind uses reinforcement learning, RLlib, and Tune to help users of AnyLogic, a widely used software package for simulations in business applications.
In this episode of the Data Exchange I speak with Weifeng Zhong, Senior Research Fellow at the Mercatus Center at George Mason University. He is the core maintainer of the open source Policy Change Index (PCI), a framework that uses machine learning and NLP to “process and read” large amounts of text to discern government priorities and policies. The initial PCI is focused on major policy shifts in China and uses NLP and machine learning to process and analyze the People’s Daily.
In this episode of the Data Exchange I speak with Ofer Razon, co-Founder & CEO at Superwise, a startup focused on building tools that help companies gain more visibility and control of machine learning models in production. Ofer and Superwise are part of a group in the early stage of building tools and best practices for scaling AI operations. The goal is to help multiple stakeholders build the necessary solutions to evaluate models, receive alerts and troubleshoot on time, validate, observe, and gather insights for more efficiency. AI assurance will ultimately bring together different parts of an organization including business, data science and operational teams, legal and compliance, and privacy and security.
In this episode of the Data Exchange I speak with Alan Nichol, co-founder and CTO of Rasa, the startup behind the popular open source framework for building conversational AI applications. I had Alan on as a guest on my old podcast, and that conversation was focused on components of Rasa and of chatbot applications. This time around we talked about the state of developer tools, as well as software engineering best practices for building conversational AI applications.
In this episode of the Data Exchange, our special correspondent and editor Jenn Webb organized a mini-panel composed of myself and Paco Nathan, author, teacher, and founder of Derwen.ai, a boutique consulting firm specializing in data, machine learning (ML), and AI.
In this episode of the Data Exchange I speak with Joel Grus, Principal Engineer at the Capital Group. He previously served as a Senior Research Engineer at the Allen Institute for AI, where he was a core engineer on AllenNLP, a PyTorch-based library for NLP research. Joel is also the author of one of the most widely read books in data science – Data Science from Scratch. Joel has a new book which I recommend highly: Ten Essays on Fizz Buzz.
In this episode of the Data Exchange I bring back Bruno Gonçalves, a data scientist working at the intersection of Data Science and Finance. Bruno was a guest on this podcast in April, when the COVID-19 cases were spiking in his home base in NYC. Prior to shifting over to data science, he spent several years as a researcher focused on mathematical models in Epidemiology – a field with a rich history dating as far back as the 1920s. I wanted to bring him back to get an update on the mathematical models being used to model the global pandemic.
In this episode of the Data Exchange I speak with Karthik Ramasamy (Senior Director Of Engineering at Splunk) and Arun Kejariwal (experienced engineering leader). The focus of our conversation was hiring technical talent such as software engineers, developers, data scientists, architects, etc. The global pandemic has caused a global economic slowdown and massive layoffs across many industry sectors. But many companies are still hiring and companies are still competing for technical talent. In our bi-weekly newsletter, links pertaining to hiring and work culture have been very popular from the outset.
In this episode of the Data Exchange I speak with Lauren Kunze, CEO of Pandorabots, a widely used platform for building chatbots. About four years ago I attended Bot Day in San Francisco, and at the time, chatbots were very much in the news. Today, chatbots are used across many industries and use cases, and on many types of devices. Lauren Kunze and Pandorabots have been at the forefront of many important developments in the conversational applications space. They help many enterprises build and deploy bots, and they also create leading edge chatbots like Mitsuku.
In this episode of the Data Exchange I speak with Ameet Talwalkar, co-founder and Chief Scientist at Determined AI, and an Assistant Professor in the Machine Learning Department at Carnegie Mellon University. A few months ago, I spoke with one of Ameet’s co-founders (Evan Sparks), around the time they announced that they were open sourcing the Determined Training Platform (DTP). Ameet and I started off by discussing the first few months of DTP as an open source project, specifically initial feedback from users, applications and use cases that they are seeing, and much more.
In this episode of the Data Exchange I speak with Denise Gosnell, Chief Data Officer at DataStax. Denise is also the co-author of the new book, The Practitioner’s Guide to Graph Data, which covers foundational tools and techniques needed to utilize graph technologies in production applications. This conversation is a great introduction to what has become an important class of technologies and tools. Graph technologies are used to power a wide array of applications, including recommendation engines, fraud detection systems, identity and access management, search, and many other use cases.
Detailed show notes can be found on The Data Exchange web site.
Subscribe to The Gradient Flow Newsletter.
In this episode of the Data Exchange I speak with Amy Heineike, Principal Product Architect at Primer.ai, a startup building machines that can read and write. Primer recently used their technology to build COVID-19 Primer, a web site that provides an overview of the latest research papers, media coverage, and social media conversations pertaining to COVID-19.
Detailed show notes can be found on The Data Exchange web site.
Subscribe to The Gradient Flow Newsletter.
In this episode of the Data Exchange I speak with Christopher Nguyen, CEO of Arimo (a Panasonic company). I first met Christopher in the early days of Apache Spark, when Arimo was one of the first companies to embrace Spark and make it a central component of their data platform. He was also an early proponent of exploring deep learning for enterprise applications. A serial entrepreneur, Christopher was also an Engineering Director at Google where he was responsible for Google Apps.
Detailed show notes can be found on The Data Exchange web site.
Subscribe to The Gradient Flow Newsletter.
In this episode of the Data Exchange I speak with Matthew Honnibal, founder of Explosion AI, a startup focused on building developer tools for AI and natural language processing. Matthew and team are the creators of popular tools like spaCy (NLP), Thinc (lightweight deep learning library), and Prodigy (annotation and active learning).
Our conversation focused on a range of topics including:
Detailed show notes can be found on The Data Exchange web site.
Subscribe to The Gradient Flow Newsletter.
In this episode of the Data Exchange I speak with Chris Wiggins, Associate Professor at Columbia University, Chief Data Scientist at the New York Times, and co-founder of hackNY. He began his career in theoretical physics but he always had a strong interest in applying quantitative techniques to other disciplines. Early in his career he became interested in applications of machine learning to problems in biology and the health sciences.
Our conversation focused on a range of topics including:
Detailed show notes can be found on The Data Exchange web site.
Subscribe to The Gradient Flow Newsletter.
In this episode of the Data Exchange I speak with Andrew Burt, Chief Legal Officer at Immuta and co-founder and Managing Partner of BNH.ai, a new law firm focused on AI compliance and related topics. As AI and machine learning become more widely deployed, lawyers and technologists need to collaborate more closely so they can identify and mitigate liabilities and risks associated with AI. BNH is the first law firm run by lawyers and technologists focused on helping companies identify and mitigate those risks.
Our conversation focused on a range of topics including:
Detailed show notes can be found on The Data Exchange web site.
Subscribe to The Gradient Flow Newsletter.
In this episode of the Data Exchange our special correspondent and editor Jenn Webb speaks with Arun Verma, Head of Quantitative Research Solutions at Bloomberg. My first job post-academia was as lead quant in a small hedge fund. Since then, I’ve followed the industry from afar and I’ve long been interested in the role of data and models in financial services. Arun and I discussed quantitative finance when we ran into each other at the O’Reilly AI conference in London last year. He was slated to give a talk on extracting trading signals from alternative data sets, an important subject among quants.
Jenn and Arun discussed a range of topics including:
Detailed show notes can be found on The Data Exchange web site.
Subscribe to The Gradient Flow Newsletter.
In this episode of the Data Exchange I speak with Harish Doddi, cofounder of Datatron, a startup focused on helping companies operationalize machine learning. Over the past two years, Harish has worked closely with enterprises to understand their needs in the areas of model operations and model governance. Last year Harish and I, along with David Talby, wrote two articles on these topics. In the first article, we described these emerging areas (“What are model governance and model operations?”), and in the second we listed lessons that ML engineers can draw from two highly regulated industries (“Managing machine learning in the enterprise: Lessons from banking and health care”).
As machine learning becomes widely deployed, organizations will need to develop processes and tools to ensure that models behave as intended. This means having the right set of controls and validation steps in place.
Our conversation focused on model governance and related topics:
Detailed show notes, including a full transcript, can be found on The Data Exchange web site.
Subscribe to The Gradient Flow Newsletter.
In this episode of the Data Exchange I speak with Wes McKinney, Director of Ursa Labs and an Apache Arrow PMC Member. Wes is the creator of pandas, one of the most widely used Python libraries for data science. He is also the author of the best-selling book, “Python for Data Analysis” – a book that has become essential reading for both aspiring and experienced data scientists.
Our conversation focused on data science tools and other topics including:
Detailed show notes can be found on The Data Exchange web site.
Subscribe to The Gradient Flow Newsletter.
In this episode of the Data Exchange I speak with Pete Warden, Staff Research Engineer at Google. Pete is a prolific author and teacher, and he has made many important contributions across many open source software projects. To name just a couple of his projects: he put together the Data Science toolkit (open data sets and open-source tools for data science) and he assembled tools to help developers get started using deep learning, long before TensorFlow and PyTorch were available. Most recently, Pete has been focused on implementing machine learning in ultra-low power systems (TinyML).
Our conversation focused on TinyML and other topics including:
Detailed show notes, including a full transcript, can be found on The Data Exchange web site.
Subscribe to The Gradient Flow Newsletter.
In this episode of the Data Exchange I speak with Evan Sparks, cofounder and CEO of Determined AI, a startup that recently open sourced a platform for training deep learning models. Many of the impressive results and applications of deep learning have happened at a handful of companies and research groups. As more companies use deep learning, they are finding that infrastructure for training and transfer learning isn’t widely available.
Our conversation focused on deep learning and other topics including:
Detailed show notes, including a full transcript, can be found on The Data Exchange web site.
Subscribe to The Gradient Flow Newsletter.
In this episode of the Data Exchange I speak with Kenneth Stanley, a Senior Research Manager at Uber AI and a Professor at UCF. Ken just announced that in June he will start a new research group focused on open-endedness at OpenAI. He is a pioneer in the field of neuroevolution – a method for evolving and learning neural networks through evolutionary algorithms. Ken and his colleague, Joel Lehman, wrote one of my favorite books on AI aimed at a broad audience: Why Greatness Cannot Be Planned. In this episode we discuss his upcoming move to OpenAI, as well as his recent work on open-ended algorithms.
Our conversation covered:
Detailed show notes can be found on The Data Exchange web site.
Subscribe to The Data Exchange Newsletter.
In this episode of the Data Exchange I speak with Bruno Gonçalves, a data scientist working at the intersection of Data Science and Finance. I have known Bruno for several years and we met when I recruited him to teach several extremely popular conference tutorials and talks on machine learning and deep learning. Prior to shifting over to data science, he spent several years as a researcher focused on mathematical models in Epidemiology – a field with a rich history dating as far back as the 1920s. This episode is devoted to tools and techniques for modeling epidemics.
Our conversation covered:
Detailed show notes can be found on The Data Exchange web site.
Subscribe to The Data Exchange Newsletter.
In this episode of the Data Exchange I speak with Rob Munro, CEO of Machine Learning Consulting and author of the forthcoming book, “Human-in-the-loop Machine Learning”. If you want a copy of Rob’s book, use the discount code podexchange20.
Our conversation covered:
Our goal in this podcast is to build a community of people interested in Data, Machine Learning and AI. If you have suggestions for us on what to recommend (books, conferences, links), and guests to book, please visit the TheDataExchange.media site and fill out the “contact” form. The first five people who fill out the form get a free book from Manning (you can view Manning’s catalog here).
Detailed show notes can be found on The Data Exchange web site.
In this episode of the Data Exchange I speak with Chris Nicholson, founder and CEO of Pathmind, a startup applying deep reinforcement learning (DRL) to simulation problems. In a recent post I highlighted two areas where companies can begin to add DRL to their suite of tools: personalization and recommendation engines, and simulation software. My interest in the interplay between DRL and simulation software began when I came across the work of Pathmind in this area.
Our conversation focused on deep reinforcement learning and its applications:
Detailed show notes can be found on The Data Exchange web site.
In this episode of the Data Exchange I speak with Solmaz Shahalizadeh, VP and Head of Data Science and Data Platform Engineering at Shopify. Shopify is a powerhouse in ecommerce and their technology powers over a million businesses worldwide. Solmaz is a frequent speaker and presenter at conferences throughout the world and she has played a critical role in helping Shopify scale its data and machine learning infrastructure.
Our conversation covered many important technical and business topics including:
Detailed show notes can be found on The Data Exchange web site.
In this episode of the Data Exchange I speak with Edo Liberty, founder of Hypercube, a startup building tools for deploying deep learning models in search and information retrieval involving large collections. When I spoke at AI Week in Tel Aviv last November several friends encouraged me to learn more about Hypercube - I’m glad I took their advice!
Our conversation covered several topics including:
Detailed show notes can be found on The Data Exchange web site.
In this episode of the Data Exchange I speak with Alejandro Saucedo, Engineering Director at Seldon, a startup building tools for productionizing machine learning. Alejandro is also Chief Scientist at The Institute for Ethical AI & Machine Learning, a UK-based research center that conducts “research into processes and frameworks that support the responsible development, deployment and operation of machine learning systems”.
Our conversation covered Alejandro’s work at both Seldon and the Institute for Ethical AI & Machine Learning:
Detailed show notes can be found on The Data Exchange web site.
In this episode of the Data Exchange I speak with Edmon Begoli, Chief Data Architect at Oak Ridge National Laboratory (ORNL). Edmon has developed and implemented large-scale data applications on systems like Open MPI, Hadoop/MapReduce, Apache Calcite, Apache Spark, and Akka. Most recently he has been building large-scale machine learning and natural language applications with Ray, a distributed execution framework that makes it easy to scale machine learning and Python applications.
Our conversation included a range of topics, including:
Detailed show notes can be found on The Data Exchange web site.
Join Michael Jordan, Manuela Veloso, Azalia Mirhoseini, Zoubin Ghahramani, Wes McKinney, Ion Stoica, Gaël Varoquaux, and many other speakers at the first Ray Summit in San Francisco, May 27-28. Tickets start at $200.
In this episode of the Data Exchange I speak with Krishna Gade, founder and CEO at Fiddler Labs, a startup focused on helping companies build trustworthy and understandable AI solutions. Prior to founding Fiddler, Krishna led engineering teams at Pinterest and Facebook.
Our conversation included a range of topics, including:
Detailed show notes can be found on The Data Exchange web site.
In this episode of the Data Exchange I speak with Dean Wampler, Head of Developer Relations at Anyscale, the startup founded by the creators of Ray. Ray is a distributed execution framework that makes it easy to scale machine learning and Python applications. It has a very simple API, and as someone who uses both Python and machine learning, I have found Ray to be a wonderful addition to my toolbox. Dean has long been one of my favorite architects, speakers and teachers, and we have known each other since the early days of Apache Spark. He has authored numerous books and is known for his interest in Scala and programming languages, as well as in software architecture.
Our conversation spanned many topics, including:
Detailed show notes can be found on The Data Exchange web site.
For more on Ray and scalable machine learning & Python, come hear from Dean Wampler, Michael Jordan, Ion Stoica, Manuela Veloso, Wes McKinney and many other leading developers and researchers at the first Ray Summit in San Francisco (May 27-28).
In this episode of the Data Exchange I speak with Dafna Shahaf, Associate Professor at the School of Computer Science and Engineering, the Hebrew University of Jerusalem. She also runs the hyadata lab, a research group that consistently produces unique and interesting projects at the intersection of computer science, data, and the social sciences.
Our conversation included a range of topics, including:
Detailed show notes can be found on The Data Exchange web site.
In this episode of the Data Exchange I speak with David Talby, co-creator of Spark NLP, an open source, highly scalable, production-grade natural language processing (NLP) library. Spark NLP has become one of the more popular NLP libraries and is available on PyPI, Conda, Maven, and Spark Packages. With recent advances in research in large-scale natural language models, there is strong interest in domain-specific natural language applications. Besides their work on Spark NLP, David and his collaborators are building natural language models tuned specifically for healthcare applications.
Our conversation spanned many topics, including:
Detailed show notes can be found on The Data Exchange web site.
In this episode of the Data Exchange I speak with Morten Dahl, research scientist at Dropout Labs, a startup building a platform and tools for privacy-preserving machine learning. He is also behind TF Encrypted, an open source framework for encrypted machine learning in TensorFlow. The rise of privacy regulations like CCPA and GDPR combined with the growing importance of ML has led to a strong interest in tools and techniques for privacy-preserving machine learning among researchers and practitioners. Morten brings the unique perspective of being a longtime security researcher who has also worked as a data scientist in industry.
Our conversation spanned many topics, including:
Detailed show notes can be found on The Data Exchange web site.
Sijie Guo on how Apache Pulsar is able to handle both queuing and streaming, and both online and offline applications.
In this episode of the Data Exchange I speak with Sijie Guo, founder of StreamNative, a new startup focused on making enterprise messaging technologies - specifically Apache Pulsar - easy to use on the cloud. Sijie was previously a cofounder of Streamlio (acquired by Splunk) and prior to that he led the messaging team at Twitter. He is also the main organizer behind the Pulsar Summit (April in San Francisco), a new conference whose Call for Speakers closes on January 31st.
Our conversation spanned many topics, including:
Detailed show notes can be found on The Data Exchange web site.
The Data Exchange Podcast: Bahman Bahmani on attracting and retaining talent, and the importance of delivery-oriented teams.
In this episode of the Data Exchange I speak with Bahman Bahmani, VP of Data Science and Engineering at Rakuten, a large Japanese ecommerce and online retail company. When I first met Bahman several years ago, he was finishing up his Computer Science PhD at Stanford, and at the time he was giving technical talks on machine learning algorithms and their applications to computer security. Today he leads a large team at Rakuten, and in my opinion he has established an organizational structure, processes and an AI practice that other companies should study.
Our conversation spanned many topics, including:
Detailed show notes can be found on The Data Exchange web site.
In this episode of the Data Exchange I speak with Nir Shavit, Professor of EECS at MIT, and cofounder and CEO of Neural Magic, a startup that is creating software to enable deep neural networks to run on commodity CPUs (at GPU speeds or faster). Their initial products are focused on model inference, but they are also working on similar software for model training.
Our conversation spanned many topics, including:
Detailed show notes can be found on The Data Exchange web site.
In this episode of the Data Exchange, I speak with my podcast co-organizer Mikio Braun, data scientist at GetYourGuide, and a former machine learning researcher and data architect. Mikio and I go out on a limb and speculate about new trends in AI and Data that we think people should pay attention to in 2020.
Our conversation spanned many topics, and we listed trends in:
Detailed show notes can be found on The Data Exchange web site.
In this episode of the Data Exchange I speak with Rajat Monga, one of the founding members of the TensorFlow Engineering team. Up until recently Rajat was the engineering manager for TensorFlow at Google.
Our conversation spanned many topics, including:
Full show notes can be found on the Data Exchange web site.
In this episode of the Data Exchange I speak with Reza Zadeh, founder and CEO of Matroid, a startup focused on making computer vision applications easy to build and deploy. Reza is also an adjunct professor at Stanford.
This particular conversation spanned many topics pertaining to computer vision, including:
Full show notes can be found on the Data Exchange site.
In this episode of The Data Exchange, I speak with Paco Nathan, author, teacher, and founder of Derwen.ai, a boutique consulting firm specializing in Data, ML, and AI. Paco consults with companies and speaks before audiences all over the world, and I plan to have him as a frequent guest on this podcast to draw on his observations of diverse organizations.
This particular conversation spanned many topics, including:
I want this to be more than just a podcast. I want to create a community to help people make better decisions. A key part of this is getting you involved. I have ideas on how this community will grow, but as a first step, I want to ask a question related to one of the topics that Paco and I discussed: PyTorch and TensorFlow. I'd love to have you weigh in by filling out the survey form. I'll report on results and key insights in a future episode of this podcast.
Full show notes can be found on the Data Exchange site.