Sweden's 100 most popular podcasts

Super Data Science: ML & AI Podcast with Jon Krohn

The latest machine learning, A.I., and data career topics from across both academia and industry are brought to you by host Dr. Jon Krohn on the Super Data Science Podcast. As the quantity of data on our planet doubles every couple of years and with this trend set to continue for decades to come, there's an unprecedented opportunity for you to make a meaningful impact in your lifetime. In conversation with the biggest names in the data science industry, Jon cuts through hype to fuel that professional impact. Whether you're curious about getting started in a data career or you're a deep technical expert, whether you'd like to understand what A.I. is or you'd like to integrate more data-driven processes into your business, we have inspiring guests and lighthearted conversation for you to enjoy. We cover tools, techniques, and implementation tricks across data collection, databases, analytics, predictive modeling, visualization, software engineering, real-world applications, commercialization, and entrepreneurship: everything you need to crush it with data science.

Subscribe

iTunes / Overcast / RSS

Website

superdatascience.com/podcast

Episodes

717: Overcoming Adversaries with A.I. for Cybersecurity, with Dr. Dan Shiebler

Dr. Dan Shiebler, Head of ML at Abnormal Security, joins Jon Krohn this week and unveils the intricacies of cybercrime detection and email protection, and the role of AI in future challenges. This episode is brought to you by Grafbase (https://grafbase.com), the unified data layer, by ODSC (https://odsc.com/), the Open Data Science Conference, and by Modelbit (https://modelbit.com), for deploying models in seconds. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• The heuristic and "intermediate" ML models that they develop at Abnormal Security [07:08]
• How Dan uses LLMs at Abnormal Security [15:46]
• How false negatives are individually the biggest classification error to avoid in cybersecurity [20:49]
• How head-to-head competitor analysis helps refine models [34:34]
• Resilient ML in cybersecurity [38:36]
• Abnormal Security's routine for updating their models [52:37]
• AI's impact on the urban world [1:09:57]
• How to stay updated in data science and AI [1:13:46]
Additional materials: www.superdatascience.com/717
2023-09-26

716: Happiness and Life-Fulfillment Hacks

Jon Krohn's 94-year-old grandmother, Annie, who's bursting with life and wisdom, shares her recipe to lifelong happiness and how relationships and daily intentions play an integral role. Annie also shares her curious take on modern technology. Get inspired by her infectious joy and perspective on life. Additional materials: www.superdatascience.com/716 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
2023-09-22

715: Make Better Decisions with Data, with Dr. Allen Downey

Join us as Dr. Allen Downey, renowned author and professor, shares insights from his upcoming book 'Probably Overthinking It,' breaking down underused techniques like Survival Analysis, explaining common paradoxes, and discussing the dynamic Overton Window. This episode is brought to you by the Zerve data science dev environment (https://zerve.ai), by Modelbit (https://modelbit.com), for deploying models in seconds, and by Grafbase (https://grafbase.com), the unified data layer. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• Why interpreting data is not always easy [06:21]
• What is Survival Analysis [15:32]
• Preston's Paradox [22:09]
• Are you Normal? [36:52]
• How to better prepare for rare "Black Swan" events [42:48]
• What is an Overton Window? [53:06]
• What is the base rate fallacy? [1:23:31]
• How to protect yourself from biased samples [1:33:39]
• Simpson's Paradox [1:42:43]
Additional materials: www.superdatascience.com/715
2023-09-19

714: Using A.I. to Overcome Blindness and Thrive as a Data Scientist

In this Friday episode, guest Tim Albiges explores with host Jon Krohn how people with blindness can have a lucrative and fulfilling career in data science, how Tim's PhD thesis applied machine learning to help diagnose chronic respiratory diseases, and the communication tools that blind people can use to live a full and independent life. Additional materials: www.superdatascience.com/714 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
2023-09-15

713: Llama 2, Toolformer and BLOOM: Open-Source LLMs with Meta's Dr. Thomas Scialom

Artificial General Intelligence, RLHF's application in AI, and how entrepreneurs can enter the AI industry: Meta's AI Research Scientist Thomas Scialom gives us behind-the-scenes insights into developing Llama 2 and what's in the works for Llama 3. With host Jon Krohn, he discusses the future of Artificial General Intelligence, why the Galactica science-focused LLM was taken down, and what he learned from it. This episode is brought to you by AWS Inferentia (https://go.aws/3zWS0au), by Grafbase (https://grafbase.com), the unified data layer, and by Modelbit (https://modelbit.com), for deploying models in seconds. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• Llama 2: Behind the Scenes of Today's Top Open-Source LLM [05:04]
• Responsible use of Llama 2 [15:26]
• Toolformer: LLM That Learns How to Use External Tools [24:57]
• Galactica: The Science-Specific LLM and Why It Was Brought Down [36:57]
• Is AGI Around the Corner? [57:03]
• Advice for AI entrepreneurs [1:05:46]
• How Thomas develops and manages large-scale AI projects [1:14:42]
Additional materials: www.superdatascience.com/713
2023-09-12

712: Code Llama

Code Llama might just be starting the revolution for how data scientists code. In this Five-Minute Friday, host Jon Krohn investigates the suite of models under the free-to-use Code Llama and how to find the best fit for your project's needs. Additional materials: www.superdatascience.com/712 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
2023-09-08

711: Image, Video and 3D-Model Generation from Natural Language, with Dr. Ajay Jain

In this episode, host Jon Krohn explores with his guest Ajay Jain, Co-Founder of Genmo.ai, how creative general intelligence could take the video industry by storm. They also discuss the models that got Genmo to this point, the applications of NeRF, and how understanding human psychology is so essential to developing models that output high-fidelity video. This episode is brought to you by the Zerve data science dev environment (https://zerve.ai), by Grafbase (https://grafbase.com), the unified data layer, and by Modelbit (https://modelbit.com), for deploying models in seconds. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• About Genmo.ai and the term "creative general intelligence" [03:47]
• Why Ajay started Genmo.ai [09:26]
• The increased performance of multimodal models [21:12]
• All about Denoising Diffusion Probabilistic Models (DDPMs) [31:03]
• The application of Neural Radiance Fields (NeRF) [55:26]
• Predicting pedestrian behavior at Uber [1:01:50]
• How to save money in the process of training models [1:12:42]
Additional materials: www.superdatascience.com/711
2023-09-05

710: LangChain: Create LLM Applications Easily in Python

Discover the power of Large Language Models with Kris Ograbek as he unravels the intricacies of LangChain and showcases a chatbot in action, all while putting our host Jon Krohn in the hot seat! Additional materials: www.superdatascience.com/710 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
2023-09-01

709: Big A.I. R&D Risks Reap Big Societal Rewards, with Meta's Dr. Laurens van der Maaten

Meta's Senior Research Director, Dr. Laurens van der Maaten, takes center stage to unravel the captivating realm of AI innovation. Learn about his groundbreaking contributions, including pioneering the t-SNE dimensionality reduction technique and harnessing AI for novel protein synthesis, climate change mitigation, and wearable materials simulation. Join us to explore the transformative power of AI across diverse domains and gain a glimpse into its future societal implications. This episode is brought to you by AWS Inferentia (https://go.aws/3zWS0au), by Modelbit (https://modelbit.com), for deploying models in seconds, and by Grafbase (https://grafbase.com), the unified data layer. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• Large-scale learning of image recognition models on web data [05:05]
• Evolutionary Scale Modeling protein models [16:45]
• Fighting climate change by building an A.I. model [29:49]
• The CrypTen privacy-preserving ML framework [38:36]
• Concerns about adversarial examples [53:25]
• Laurens' t-SNE algorithm [58:56]
• How to make a big impact [1:07:25]
Additional materials: www.superdatascience.com/709
2023-08-29

708: ChatGPT Code Interpreter: 5 Hacks for Data Scientists

On this week's Five-Minute Friday, host Jon Krohn gives five reasons why he is so excited about ChatGPT's Code Interpreter and walks listeners through its capabilities with a practical example. Additional materials: www.superdatascience.com/708 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
2023-08-25

707: Vicuña, Gorilla, Chatbot Arena and Socially Beneficial LLMs, with Prof. Joey Gonzalez

LLM Vicuña, Chatbot Arena, and the race to increase LLM context windows: This episode's guest Joey Gonzalez talks to Jon Krohn about developing models and platforms that leverage and improve LLMs, as well as the future of AI development and access. This episode is brought to you by the AWS Insiders Podcast (https://pod.link/1608453414), by Modelbit (https://modelbit.com), for deploying models in seconds, and by Grafbase (https://grafbase.com), the unified data layer. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• Vicuña: How the revolutionary LLM came to be [03:35]
• Chatbot Arena: The leading LLM leaderboard [09:47]
• Trusting LLM results [17:54]
• Gorilla: The open-source ChatGPT plugin alternative [32:13]
• About LMSYS and long context windows [47:48]
• Open- vs closed-source LLMs: Which is better? [1:01:39]
• Aqueduct [1:16:49]
• Founding GraphLab [1:27:02]
• How AI will positively impact society in the coming decades [1:33:23]
Additional materials: www.superdatascience.com/707
2023-08-22

706: Large Language Model Leaderboards and Benchmarks

In this episode, Caterina Constantinescu dives deep into Large Language Models (LLMs), spotlighting top leaderboards, evaluation benchmarks, and real-world user perceptions. Plus, discover the challenges of dataset contamination and the intricacies of platforms like HELM and Chatbot Arena. Additional materials: www.superdatascience.com/706 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
2023-08-18

705: Feeding the World with ML-Powered Precision Agriculture

Join Jon Krohn as he chats with Syngenta Group's Feroz Sheikh, Jeremy Groeteke, and Thomas Jung about the digital revolution in agriculture. Learn how data science is evolving farming, from precision techniques to global food solutions. A compelling blend of tech meets nature. This episode is brought to you by AWS Inferentia (https://go.aws/3zWS0au) and by Modelbit (https://modelbit.com), for deploying models in seconds. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• What is precision agriculture? [09:43]
• What is computational agronomy? [12:30]
• How Syngenta helps growers optimize yields [21:37]
• How to bridge the gap between R&D and out in the real world [33:58]
• What is generative chemistry? [37:52]
• How generative chemistry accelerates the discovery of new compounds [41:55]
• How you could make a big social impact in agriculture with data science [56:22]
• How to go about designing ML models for agriculture [1:00:27]
Additional materials: www.superdatascience.com/705
2023-08-15

704: Jon's "Generative A.I. with LLMs" Hands-on Training

Take on the world of GPT and learn to develop your own, commercially successful Large Language Models (LLMs) with Jon Krohn's comprehensive, guided training video for generative AI. Get to grips with the technology, learn which tools to use, and find out how to get an eye for business-viable models with Jon's (ad-)free educational video. Additional materials: www.superdatascience.com/704 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
2023-08-11

703: How Data Happened: A History, with Columbia Prof. Chris Wiggins

Statistics history, interdisciplinarity, and data and society. Chris Wiggins talks with Jon Krohn about the power dynamics of data, the transformation of the field of biology through data-driven approaches to genetic sequencing, and the New York Times' data science team's cutting-edge approach to accommodating its tech stack. This episode is brought to you by the AWS Insiders Podcast (https://pod.link/1608453414) and by Modelbit (https://modelbit.com), for deploying models in seconds. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• The importance of the humanities in data science [09:18]
• How data science "rearranges" power [17:19]
• An overview of How Data Happened [20:36]
• The controversial nature of Bayes theorem [29:16]
• Why we need to consider data ethics [34:00]
• How biology came to adopt data science into its field [45:44]
• The data science tech stack at the New York Times [49:18]
Additional materials: www.superdatascience.com/703
2023-08-08

702: Llama 2: It's Time to Upgrade your Open-Source LLM

This week, Jon Krohn is examining Meta's newly released open-source large language model, Llama 2, highlighting its commercial prospects, immense capacity, model variety, and unique 'time awareness' feature. He also discusses its innovative two-stage RLHF approach that enhances its performance. Additional materials: www.superdatascience.com/702 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
2023-08-04

701: Generative A.I. without the Privacy Risks (with Prof. Raluca Ada Popa)

Dr. Raluca Ada Popa, renowned computer scientist, entrepreneur, and President of Opaque Systems, joins Jon Krohn to share her insights on securely interacting with AI APIs like OpenAI's GPT-4, the pros and cons of open vs. closed-source AI development, and the seamless operation of compute pipelines across multiple clouds. This episode is brought to you by AWS Inferentia (https://go.aws/3zWS0au) and by Modelbit (https://modelbit.com), for deploying models in seconds. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• What is a confidential computing platform? [04:31]
• How to get started with confidential computing [12:10]
• The challenges of confidential computing and LLMs [21:11]
• How to safeguard your data while using commercial LLMs like GPT-4 [38:00]
• Open-source vs closed-source [52:28]
• Raluca's PreVail cybersecurity company [1:01:50]
• Combining entrepreneurship and academic career [1:04:03]
• DARE Program [1:10:39]
Additional materials: www.superdatascience.com/701
2023-08-01

700: "The Dream of Life" by Alan Watts

Yoga and Hindu mythology: This special episode continues the thread of our centenary episodes, SDS 500: Yoga Nidra with Jes Allen and SDS 600: Yoga Nidra Practice with Steve Fazzari, which talked through guided meditation techniques to help improve posture, sleep, and expand consciousness. Inspired by these sessions, host Jon Krohn explores Hindu mythology via Alan Watts' "The Dream of Life". Additional materials: www.superdatascience.com/700 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
2023-07-28

699: The Modern Data Stack, with Harry Glaser

Model deployment, data warehouse options for running models, and how to best leverage BI tools: Harry Glaser and Jon Krohn discuss Modelbit's capabilities to automate ML models from notebooks into production-ready models, reducing the time and effort in "translating" information from one mode to another. Harry's conversation with host Jon Krohn expanded on the importance of automating this task, and how developments in ML modeling have widened access to entire teams to analyze data, whatever their level of expertise. This episode is brought to you by the AWS Insiders Podcast (https://pod.link/1608453414). Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• What the modern data stack is [03:28]
• Version control for data scientists [13:30]
• CI/CD, load balancing and logging [20:38]
• Snowflake vs. Redshift [30:10]
• How tools like Looker and Tableau help monitor models [35:26]
Additional materials: www.superdatascience.com/699
2023-07-25

698: How Firms Can Actually Adopt A.I., with Rehgan Avon

Company-wide AI adoption can take a lot of persuasion. Rehgan Avon talks to host Jon Krohn about why AI has become necessary for forward-thinking businesses and the steps to implement AI in an institution so that everyone benefits. Additional materials: www.superdatascience.com/698 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
2023-07-21

697: The (Short) Path to Artificial General Intelligence, with Dr. Ben Goertzel

AI visionary and CEO of SingularityNET Dr. Ben Goertzel provides a deep dive into the possible realization of Artificial General Intelligence (AGI) within 3-7 years. Explore the intriguing connections between self-awareness, consciousness, and the future of Artificial Super Intelligence (ASI) and discover the transformative societal changes that could arise. This episode is brought to you by AWS Inferentia (https://go.aws/3zWS0au), the AWS Insiders Podcast (https://pod.link/1608453414), and by Modelbit (https://modelbit.com), for deploying models in seconds. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• Decentralized and benevolent AGI [03:13]
• The SingularityNET ecosystem [13:10]
• Dr. Goertzel's vision for realizing AGI - combining DL with neuro-symbolic systems, genetic algorithms and knowledge graphs [25:50]
• How reaching AGI will trigger Artificial Super Intelligence [38:51]
• Dr. Goertzel's approach to AGI using OpenCog Hyperon [42:34]
• Why Dr. Goertzel believes AGI will be positive for humankind [53:07]
• How to ensure the AGI is benevolent [1:06:43]
• How AGI or ASI may act ethically [1:13:50]
Additional materials: www.superdatascience.com/697
2023-07-18

696: Brain-Computer Interfaces and Neural Decoding, with Prof. Bob Knight

Jon Krohn welcomes Professor Dr. Bob Knight to explore human intelligence, the prefrontal cortex, and the transformative potential of brain implants for data collection. Discover the pivotal role of machine learning in treating Parkinson's and delve into exciting future advancements. Additional materials: www.superdatascience.com/696 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
2023-07-14

695: NLP with Transformers, feat. Hugging Face's Lewis Tunstall

What are transformers in AI, and how do they help developers to run LLMs efficiently and accurately? This is a key question in this week's episode, where Hugging Face's ML Engineer Lewis Tunstall sits down with host Jon Krohn to discuss encoders and decoders, and the importance of continuing to foster democratic environments like GitHub for creating open-source models. This episode is brought to you by the AWS Insiders Podcast (https://pod.link/1608453414), by https://WithFeeling.ai, the company bringing humanity into AI, and by Modelbit (https://modelbit.com), for deploying models in seconds. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• What a transformer is, and why it is so important for NLP [04:34]
• Different types of transformers and how they vary [11:39]
• Why it's necessary to know how a transformer works [31:52]
• Hugging Face's role in the application of transformers [57:10]
• Lewis Tunstall's experience of working at Hugging Face [1:02:08]
• How and where to start with Hugging Face libraries [1:18:27]
• The necessity to democratize ML models in the future [1:25:25]
Additional materials: www.superdatascience.com/695
2023-07-11

694: CatBoost: Powerful, efficient ML for large tabular datasets

Modeling tabular data and spreadsheets doesn't have to be tedious with CatBoost's open-source tree-boosting algorithm. CatBoost does what it says on the tin, blending categories with boosting that allows you to train your models faster and handle large datasets for ML tasks across multiple GPUs. In this week's Five-Minute Friday, host Jon Krohn gets to grips with the technical components of CatBoost that give it the speed and accuracy so acclaimed by its users. Additional materials: www.superdatascience.com/694 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
2023-07-07

693: YOLO-NAS: The State of the Art in Machine Vision, with Harpreet Sahota

Harpreet Sahota, a data science expert and deep learning developer at Deci AI, joins Jon Krohn to explore the fascinating realm of object detection and the revolutionary YOLO-NAS model architecture. Discover how machine vision models have evolved and the techniques driving compute-efficient edge device applications. This episode is brought to you by AWS Inferentia (https://go.aws/3zWS0au), by https://WithFeeling.ai, the company bringing humanity into AI, and by Modelbit (https://modelbit.com), for deploying models in seconds. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• What is machine vision? [07:02]
• Object detection and YOLO architectures [13:00]
• Deci's YOLO-NAS: Optimal object detection model architecture [23:39]
• Developer Relations [1:00:16]
• Harpreet's 'top-down' approach to learning Deep Learning [1:06:50]
Additional materials: www.superdatascience.com/693
2023-07-04

692: Lossless LLM Weight Compression: Run Huge Models on a Single GPU

Join Jon as he navigates listeners through the innovative SpQR approach, a cutting-edge, lossless LLM weight compression technique that harnesses the power of quantization. Tune in as Jon delves into the four steps behind this groundbreaking method in this week's episode. Additional materials: www.superdatascience.com/692 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
2023-06-30

691: A.I. Accelerators: Hardware Specialized for Deep Learning

GPUs vs CPUs, chip design and the importance of chips in AI research: This highly technical episode is for anyone who wants to learn what goes into chip development and how to get into the competitive industry of accelerator design. With advice from expert guest Ron Diamant, Senior Principal Engineer at AWS, you'll get a breakdown of the need-to-know technical terms, what chip engineers need to think about during the design phase and what the future holds for processing hardware. This episode is brought to you by Posit, the open-source data science company (https://posit.co), by the AWS Insiders Podcast (https://pod.link/1608453414), and by https://WithFeeling.ai, the company bringing humanity into AI. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• What CPUs and GPUs are [05:29]
• The differences between accelerators used for deep learning [14:31]
• Trainium and Inferentia: AWS's A.I. Accelerators [22:10]
• If model optimizations will lead to lower demand for hardware to process them [43:14]
• How a chip designer goes about production [48:34]
• Breaking down the technical terminology for chips (accelerator interconnect, dynamic execution, collective communications) [55:29]
• The importance of AWS Neuron, a software development kit [1:15:42]
• How Ron got his foot in the door with chip design [1:26:40]
Additional materials: www.superdatascience.com/691
2023-06-27

690: How to Catch and Fix Harmful Generative A.I. Outputs

Krishna Gade, the founder and CEO of Fiddler.AI, discusses the challenges faced by Large Language Models (LLMs) in Generative AI, including inaccuracies, biases, and privacy risks. He emphasizes the importance of monitoring to build trust in AI and highlights Fiddler's explainability algorithms and pre-built bias detection tools as vital solutions. Additional materials: www.superdatascience.com/690 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
2023-06-23

689: Observing LLMs in Production to Automatically Catch Issues

Arize's Amber Roberts and Xander Song join Jon Krohn this week, sharing invaluable insights into ML Observability, drift detection, retraining strategies, and the crucial task of ensuring fairness and ethical considerations in AI development. This episode is brought to you by Posit, the open-source data science company (https://posit.co), by AWS Inferentia (go.aws/3zWS0au), and by Anaconda, the world's most popular Python distribution (https://superdatascience.com/anaconda). Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• What is ML Observability [05:07]
• What is Drift [08:18]
• The different kinds of model drift [15:31]
• How frequently production models should be retrained? [25:15]
• Arize's open-source product, Phoenix [30:49]
• How ML Observability relates to discovering model biases [50:30]
• Arize case studies [57:13]
• What is a developer advocate [1:04:51]
Additional materials: www.superdatascience.com/689
2023-06-20

688: Six Reasons Why Building LLM Products Is Tricky

Prompt injection, prompt engineering, context windows, and more: In this week's Five-Minute Friday, Jon explains why anyone looking to build their own product leveraging LLMs should stop to consider these and three more issues before jumping in. Phillip Carter first outlined these six issues in his article "All the Hard Stuff Nobody Talks About when Building Products with LLMs". Additional materials: www.superdatascience.com/688 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
2023-06-16

687: Generative Deep Learning, with David Foster

Autoencoders, transformers, latent space: Learn the elements of generative AI and hear what data scientist David Foster has to say about the potential for generative AI in music, as well as the role that world models play in blending generative AI with reinforcement learning. This episode is brought to you by Posit, the open-source data science company (https://posit.co), by Anaconda, the world's most popular Python distribution (superdatascience.com/anaconda), and by https://WithFeeling.ai, the company bringing humanity into AI. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• Generative modeling vs discriminative modeling [04:21]
• Generative AI for Music [13:12]
• On the threats of AI [23:15]
• Autoencoders Explained [38:36]
• Noise in Generative AI [48:11]
• What CLIP models are (Contrastive Language-Image Pre-training) [54:07]
• What World Models are [1:00:40]
• What a Transformer is [1:11:14]
• How to use transformers for music generation [1:19:50]
Additional materials: www.superdatascience.com/687
2023-06-13

686: Open-Source "Responsible A.I." Tools, with Ruth Yakubu

Microsoft's Ruth Yakubu joins Jon Krohn to discuss Responsible AI principles and the open-source Responsible AI Toolbox, allowing users to assess their models for fairness, inclusiveness, privacy, explainability, accountability, and reliability before deployment. Additional materials: www.superdatascience.com/686 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
2023-06-09

685: Tools for Building Real-Time Machine Learning Applications, with Richmond Alake

Richmond Alake, a Machine Learning Architect at Slalom Build, sits down with Jon to share real-time ML insights, tools and career experiences for a high-energy and high-impact episode. From his work at Slalom Build to his two AI startups, discover the software choices, ML tools, and front-end development techniques used by a leader in the field. This episode is brought to you by Posit, the open-source data science company (https://posit.co), by AWS Inferentia (go.aws/3zWS0au), and by https://WithFeeling.ai, the company bringing humanity into AI. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• What is a Machine Learning Architect? [03:09]
• Richmond's startups [12:07]
• Why Richmond started a podcast [29:51]
• Richmond's new course on feature stores [38:05]
• Why Richmond produces data science content [43:25]
• Why All Data Scientists Should Write [51:30]
Additional materials: www.superdatascience.com/685
2023-06-06

684: Get More Language Context out of your LLM

Open-source LLMs, FlashAttention and generative AI terminology: Host Jon Krohn gives us the lift we need to explore the next big steps in generative AI. Listen to the specific way in which Stanford University's "exact attention" algorithm, FlashAttention, could become a competitor for GPT-4's capabilities. Additional materials: www.superdatascience.com/684 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
2023-06-02

683: Contextual A.I. for Adapting to Adversaries, with Dr. Matar Haller

Monitoring malicious, user-generated content; contextual AI; adapting to novel evasion attempts: Matar Haller speaks to Jon Krohn about the challenges of identifying, analyzing and flagging malicious information online. In this episode, Matar explains how contextual AI and a "database of evil" can help resolve the multiple challenges of blocking dangerous content across a range of media, even those that are live-streamed. This episode is brought to you by Posit, the open-source data science company (posit.co), by Anaconda, the world's most popular Python distribution (superdatascience.com/anaconda), and by https://WithFeeling.ai, the company bringing humanity into AI. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• How ActiveFence helps its customers to moderate platform content [05:36]
• How ActiveFence finds extreme social media users trying to evade detection [16:32]
• How to monitor live-streaming content and analyze it for dangerous material [29:13]
• The technologies ActiveFence uses to run its platform [35:54]
• Matar's experience of the Insight Fellows Program (Data Science Fellowship) [40:28]
• Leadership opportunities for women in STEM [1:00:41]
• Israel's R&D edge for AI [1:13:19]
Additional materials: www.superdatascience.com/683
2023-05-30

682: Business Intelligence Tools, with Mico Yuk

In this week's episode, Mico Yuk, host of 'Analytics on Fire', joins Jon Krohn to share her effective business intelligence and analytics framework, BIDS, for persuading key decision makers. She crowns one "power" tool as the analytics king and discusses emerging tools that could challenge its dominance. Tune in for unapologetic insights on future and current BI trends and happenings from the world of BI and analytics. Additional materials: www.superdatascience.com/682 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
2023-05-26

681: XGBoost: The Ultimate Classifier, with Matt Harrison

Unlock the power of XGBoost by learning how to fine-tune its hyperparameters and discover its optimal modeling situations. This and more, when best-selling author and leading Python consultant Matt Harrison teams up with Jon Krohn for yet another jam-packed technical episode! Are you ready to upgrade your data science toolkit in just one hour? Tune in now! This episode is brought to you by Pathway, the reactive data processing framework (pathway.com/?from=superdatascience), by Posit, the open-source data science company (posit.co), and by Anaconda, the world's most popular Python distribution (superdatascience.com/anaconda). Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information. In this episode you will learn: • Matt's book "Effective XGBoost" [07:05] • What is XGBoost [09:09] • XGBoost's key model hyperparameters [19:01] • XGBoost's secret sauce [29:57] • When to use XGBoost [34:45] • When not to use XGBoost [41:42] • Matt's recommended Python libraries [47:36] • Matt's production tips [57:57] Additional materials: www.superdatascience.com/681
2023-05-23

680: Automating Industrial Machines with Data Science and the Internet of Things (IoT)

Industrial machinery's dependence on data science, tech stacks to build IoT platforms, and transitioning from data science to product: This week's Friday episode with Allegra Alessi explores the minutiae of product ownership for the Internet of Things at packaging company Bobst. Join host Jon Krohn and his guest as they unpack how the IoT is leading factory production. Additional materials: www.superdatascience.com/680 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
2023-05-19

679: The A.I. and Machine Learning Landscape, with investor George Mathew

Generative AI, MLOps, and making smart investments in AI: This week's episode is critical listening for AI investors and generative AI creators. AI investor George Mathew talks with host Jon Krohn about the emerging generative AI stack, the critical elements of MLOps to ensure a scalable model, and the tools developers can use for a saleable product. This episode is brought to you by Posit, the open-source data science company (posit.co), by AWS Inferentia (https://go.aws/3zWS0au), and by Anaconda, the world's most popular Python distribution (superdatascience.com/anaconda). Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information. In this episode you will learn: • Venture capital's role in the technology startup ecosystem [05:59] • How RLHF helps UI become more intuitive [12:53] • The four layers of the generative AI stack [34:16] • The risks for generative AI business founders and investors [46:50] • How MLOps drive best practices and help implementation [56:33] • The importance of PLG (Product-Led Growth) [1:04:15] • How generative AI tools will impact the labor market [1:17:34] Additional materials: www.superdatascience.com/679
2023-05-16

678: StableLM: Open-source "ChatGPT"-like LLMs you can fit on one GPU

StableLM, the new family of open-source language models from the brilliant minds behind Stable Diffusion, is out! Small but mighty, these models have been trained on an unprecedented amount of data for single-GPU LLMs. This week, Jon breaks down the mechanics of this model. See you there! Additional materials: www.superdatascience.com/678 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
2023-05-12

677: Digital Analytics with Avinash Kaushik

How does one use marketing analytics to drive business success? Avinash Kaushik, Chief Strategy Officer at Croud and former Sr. Director of Global Strategic Analytics at Google, joins Jon Krohn live for an exciting episode that covers the transformative power of AI, his 'four clusters of intent' framework and the value of hands-on data tools. This episode is brought to you by Pathway, the reactive data processing framework (https://pathway.com/?from=superdatascience), by Posit, the open-source data science company (https://posit.co), and by Anaconda, the world's most popular Python distribution (https://superdatascience.com/anaconda). Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information. In this episode you will learn: • What is a chief strategy officer? [3:55] • Brand vs performance analytics [7:23] • Incrementality-centric marketing [32:53] • Avinash's time at Google [37:54] • How to maintain human-touch with AI [48:58] • Four clusters of intent framework [1:11:28] • Avinash's most significant career challenges [1:17:18] Additional materials: www.superdatascience.com/677
2023-05-09

676: The Chinchilla Scaling Laws

Chinchilla AI, and fine-tuning proprietary tasks with large language models: On this week's Five-Minute Friday, host Jon Krohn outlines the principles of the Chinchilla Scaling Laws, the incredible power of models such as Cerebras-GPT based on these laws, and the impact of scaling on the number of viable applications and commercial use cases. Additional materials: www.superdatascience.com/676 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
2023-05-05

675: Pandas for Data Analysis and Visualization

Wrangling data in Pandas, when to use Pandas, Matplotlib or Seaborn, and why you should learn to create Python packages: Jon Krohn speaks with guest Stefanie Molin, author of Hands-On Data Analysis with Pandas. This episode is brought to you by Posit, the open-source data science company (https://posit.co), and by AWS Inferentia (https://go.aws/3zWS0au). Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information. In this episode you will learn: • The advantages of using pandas over other libraries [07:55] • Why data wrangling in pandas is so helpful [12:05] • Stefanie's Data Morph library [24:27] • When to use pandas, matplotlib, or seaborn [33:45] • Understanding the ticker module in matplotlib [36:48] • Where data analysts should start their learning journey [40:08] • What it's like being a software engineer at Bloomberg [51:19] Additional materials: www.superdatascience.com/675
2023-05-02

674: Parameter-Efficient Fine-Tuning of LLMs using LoRA (Low-Rank Adaptation)

Models like Alpaca, Vicuña, GPT4All-J and Dolly 2.0 have relatively small model architectures, but they're prohibitively expensive to train even on a small amount of your own data. The standard model-training protocol can also lead to catastrophic forgetting. In this week's episode, Jon explores a solution to these problems, introducing listeners to Parameter-Efficient Fine-Tuning (PEFT) and the leading approach: Low-Rank Adaptation (LoRA). Additional materials: www.superdatascience.com/674 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
2023-04-28

673: Taipy, the open-source Python application builder

Vincent Gosselin, CEO and co-founder of Taipy, an open-source Python library, joins Jon Krohn to discuss how to accelerate productivity in Python and build scalable, reusable, and maintainable data pipelines. Gosselin shares his breadth of wisdom honed over his decades-long AI career. This episode is brought to you by Pathway, the reactive data processing framework (https://pathway.com/?from=superdatascience), and by Posit, the open-source data science company (https://posit.co/academy). Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information. In this episode you will learn: • The Taipy library functionality [2:59] • The future of data pipelines [21:40] • Common trends of companies that are successful at adopting data pipelines [28:31] • How no-code and low-code trends impact the data science lifecycle [33:00] • How Vincent chose the programming languages that underpin Taipy [41:40] • Common trends on how companies manage their data to learn from it [45:06] • Vincent's perspective on AI winters [51:03] Additional materials: www.superdatascience.com/673
2023-04-25

672: Open-source "ChatGPT": Alpaca, Vicuña, GPT4All-J, and Dolly 2.0

Get started with language models: Learn about the commercial-use options available for your business in this week's Five-Minute Friday, where host Jon Krohn discusses four models that have many of the capabilities of ChatGPT and can run at a fraction of the cost. Additional materials: www.superdatascience.com/672 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
2023-04-21

671: Cloud Machine Learning

Get to grips with AWS, Azure, and Google Cloud Platform on this week's episode. Host Jon Krohn speaks with Kirill Eremenko and Hadelin de Ponteves about CloudWolf, a cloud computing educational platform that prepares students for certification in AWS (Amazon Web Services). Find out why an accreditation in cloud computing could be the safest investment for your data science career. This episode is brought to you by Posit, the open-source data science company (https://posit.co/academy), and by AWS Inferentia (https://aws.amazon.com/ec2/instance-types/inf2/?trk=bbd10c3f-c200-4629-bca8-adf6ad324c9e&sc_channel=el). Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information. In this episode you will learn: • About CloudWolf [07:04] • Why learning the cloud is important for data scientists [09:12] • Is learning cloud computing complex? [22:30] • Essential AWS services [28:31] • Database options on AWS [33:47] • How to run analytics on AWS [40:58] • Why an AWS certification is so helpful [56:35] Additional materials: www.superdatascience.com/671
2023-04-18

670: LLaMA: GPT-3 performance, 10x smaller

How does Meta AI's natural language model, LLaMA, compare to the rest? Based on the Chinchilla scaling laws, LLaMA is designed to be smaller but more performant. But how exactly does it achieve this feat? It's all done by training a small model for a longer period of time. Discover how LLaMA compares to its competition, including GPT-3, in this week's episode. Additional materials: www.superdatascience.com/670 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
2023-04-14

669: Streaming, reactive, real-time machine learning

In this episode, Jon Krohn welcomes Adrian Kosowski, Co-Founder and Chief Product Officer at Pathway, who shares insights on streaming data processing and reactive data processing, and how they're shaping the future of machine learning. Tune in now for an unforgettable episode. This episode is brought to you by Posit, the open-source data science company (https://posit.co/academy), and by AWS Inferentia (https://aws.amazon.com/ec2/instance-types/inf2/?trk=bbd10c3f-c200-4629-bca8-adf6ad324c9e&sc_channel=el). Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information. In this episode you will learn: • About Pathway's reactive data processing framework [04:45] • Reactive data processing use cases [17:08] • What is the difference between batch and streaming processing [33:18] • Transformers in data engineering and data streaming [53:44] • The benefits of Adrian's technical background as a CPO [1:04:17] • Adrian's responsibilities and favorite tools as a CPO [1:15:25] • Emerging ML approaches and tools for startups [1:28:49] Additional materials: www.superdatascience.com/669
2023-04-11

668: GPT-4: Apocalyptic stepping stone?

AI risks, RLHF, and inner alignment: GPT stands to give the business world a major boost. But with everyone racing either to develop products that incorporate GPT or use it to carry out critical tasks, what dangers could lie ahead in working with a tool that applies essentially unknowable means (inner alignments) to reach its goals? This week's guest Jérémie Harris speaks with Jon Krohn about the essential need for anyone working with GPT to understand the impact of a system comprising inner alignments that cannot, and may never, be fully understood. Additional materials: www.superdatascience.com/668 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
2023-04-07

A small service by I'm With Friends.
Updated with help from iTunes.