51 episodes • Length: 40 min • Irregular
Machine learning audio course, teaching the fundamentals of machine learning and artificial intelligence. It covers intuition, models (shallow and deep), math, languages, frameworks, etc. Where your other ML resources provide the trees, I provide the forest. Consider MLG your syllabus, with highly-curated resources for each episode’s details at ocdevel.com. Audio is a great supplement during exercise, commute, chores, etc.
The podcast Machine Learning Guide is created by OCDevel.
Links
According to the Aider Leaderboard (as of April 12, 2025), the leading models for vibe-coding include:
Links
I currently favor Roo Code, paired with gemini-2.5-pro-exp-03-25 for Architect, Boomerang, or Code modes with large contexts, and Claude 3.7 for Code with small contexts, e.g. Boomerang subtasks. Many others favor Cursor, Aider, or Cline. Copilot and Windsurf are less in vogue lately. I found Copilot to struggle more, and its pricing - previously its winning point - is less compelling now.
Why I favor Roo: the default settings make it as stable and effective as Cline or Cursor, but you can tinker more with those settings - e.g., for Gemini 2.5 I disable partial file reads (since it has a huge context window). Roo's modes are elegantly just custom system prompts (an oversimplification), which makes custom workflows very powerful. A potent example is Boomerang Mode, an orchestrator that delegates planning and edit subtasks to keep context windows tight. Boomerang Mode alone is a selling point; it's incredibly powerful. Aider is still a darn decent X-Acto knife, but as Roo has grown, I haven't found much need for it.
Tools discussed:
Other:
"Vibe coding" using AI agents in software development. It uses LLMs for code generation and project management. Developers are increasingly relying on agentic tools and IDE plugins to improve productivity.
Use of AI in Code Generation
Links:
Try a walking desk to stay healthy while you study or work!
Full notes at ocdevel.com/mlg/mla-21
Raybeam and Databricks: Ming Chang from Raybeam discusses Raybeam's focus on data science and analytics, and how their recent acquisition by Dept Agency has expanded their scope into ML Ops and AI. Raybeam often utilizes Databricks due to its comprehensive nature.
Understanding Databricks: Contrary to initial assumptions, Databricks is not just an analytics platform like Tableau but an ML Ops platform competing with tools like SageMaker and Kubeflow. It offers functionalities for creating notebooks, executing Python code, and using a hosted Spark cluster and Delta Lake for data storage.
Choosing the Right MLOps Tool: Depending on client requirements, Raybeam might recommend different tools. Decision factors include client's existing expertise, infrastructure needs, and scaling challenges. Databricks is often recommended for its ease of use and features.
Databricks Features: Offers a hosted solution for Spark clusters on AWS, Azure, or GCP; integrates with IDEs like VSCode through Databricks Connect; provides a unique Git integration for version control of notebooks; and utilizes Delta Lake for version control of Parquet files, enhancing operations like edit and delete.
Parquet and Delta Lake: Parquet files are optimized for big data, and Delta Lake provides transaction-like operations over Parquet by maintaining version history.
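A minimal sketch of what that version history enables in practice, assuming the delta-spark package is installed alongside PySpark (the path and data are purely illustrative):

```python
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("delta-demo")
         # These two configs enable Delta Lake, per the delta-spark docs.
         .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
         .config("spark.sql.catalog.spark_catalog",
                 "org.apache.spark.sql.delta.catalog.DeltaCatalog")
         .getOrCreate())

df = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "val"])
df.write.format("delta").save("/tmp/events")                     # version 0
df.write.format("delta").mode("overwrite").save("/tmp/events")   # version 1

# "Time travel": read the original version back, despite the overwrite.
v0 = spark.read.format("delta").option("versionAsOf", 0).load("/tmp/events")
v0.show()
```

The overwrite creates a new table version rather than destroying the underlying Parquet data, which is what makes edit, delete, and rollback operations practical.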
Pricing and Usage: Databricks adds a nominal fee on top of cloud provider charges. It's accessible for single developers and startups, making it suitable for various scales of operations.
Ming Chang's Picks: Discusses interests in automated stock trading projects and building drones with Raspberry Pi, highlighting the intersection of programming and physical computing.
For a hands-on look at Ming Chang's drone project, follow his developments or connect for insights on building a Raspberry Pi-powered drone.
Try a walking desk to stay healthy while you study or work!
Full notes at ocdevel.com/mlg/mla-20
Conversation with Dirk-Jan about Kubeflow (vs. cloud-native solutions like SageMaker)
Dirk-Jan Verdoorn - Data Scientist at Dept Agency
Kubeflow. (From the website:) The Machine Learning Toolkit for Kubernetes. The Kubeflow project is dedicated to making deployments of machine learning (ML) workflows on Kubernetes simple, portable and scalable. Our goal is not to recreate other services, but to provide a straightforward way to deploy best-of-breed open-source systems for ML to diverse infrastructures. Anywhere you are running Kubernetes, you should be able to run Kubeflow.
TensorFlow Extended (TFX). If using TensorFlow with Kubeflow, combine with TFX for maximum power. (From the website:) TensorFlow Extended (TFX) is an end-to-end platform for deploying production ML pipelines. When you're ready to move your models from research to production, use TFX to create and manage a production pipeline.
Alternatives:
Try a walking desk to stay healthy while you study or work!
Full notes at ocdevel.com/mlg/mla-19
Chatting with co-workers about the role of DevOps in a machine learning engineer's life
Expert coworkers at Dept
DevOps tools
Pictures (funny and serious)
Try a walking desk to stay healthy while you study or work!
Show notes: ocdevel.com/mlg/mla-17
Developing on AWS first (SageMaker or other)
Consider developing against AWS as your local development environment, rather than only your cloud deployment environment. Solutions:
Connect to deployed infrastructure via Client VPN
Infrastructure as Code
Try a walking desk to stay healthy while you study or work!
Full notes at ocdevel.com/mlg/mla-16
Part 2 of deploying your ML models to the cloud with SageMaker (MLOps)
MLOps is deploying your ML models to the cloud. See MadeWithML for an overview of tooling (also generally a great ML educational run-down.)
Try a walking desk to stay healthy while you study or work!
Show notes: Part 1 of deploying your ML models to the cloud with SageMaker (MLOps)
MLOps is deploying your ML models to the cloud. See MadeWithML for an overview of tooling (also generally a great ML educational run-down.)
I forgot to mention JumpStart; I'll cover it next time.
Try a walking desk to stay healthy while you study or work!
Full notes at ocdevel.com/mlg/mla-14
Server-side ML. Training & hosting for inference, with a goal towards serverless. AWS SageMaker, Batch, Lambda, EFS, Cortex.dev
Try a walking desk to stay healthy while you study or work!
Full notes at ocdevel.com/mlg/mla-13
Client, server, database, etc.
Try a walking desk to stay healthy while you study or work!
Full notes at ocdevel.com/mlg/mla-12
Use Docker for env setup on localhost & cloud deployment, instead of pyenv / Anaconda. I recommend Windows for your desktop.
Try a walking desk to stay healthy while you study or work!
Show notes at ocdevel.com/mlg/32.
L1/L2 norms, Manhattan, Euclidean, cosine distances, dot product (see the sketch after this list)
Normed distances link
Dot product
Cosine (normalized dot)
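A minimal NumPy sketch of these measures (the vectors are arbitrary examples):

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])

manhattan = np.sum(np.abs(a - b))   # L1 norm of the difference
euclidean = np.linalg.norm(a - b)   # L2 norm of the difference
dot = np.dot(a, b)                  # unnormalized similarity
cosine = dot / (np.linalg.norm(a) * np.linalg.norm(b))  # normalized dot
print(manhattan, euclidean, dot, cosine)
```

Note that the dot product grows with vector magnitude, while cosine strips magnitude out - which is why cosine is the usual choice for comparing embeddings.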
Try a walking desk to stay healthy while you study or work!
Full notes at ocdevel.com/mlg/mla-11
K-means (sklearn vs. FAISS), finding n_clusters via inertia/silhouette, Agglomerative, DBSCAN/HDBSCAN
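A hedged sketch of that n_clusters search with scikit-learn (random data stands in for real features):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

X = np.random.rand(500, 8)  # placeholder features; swap in real data

# Inertia always decreases as n_clusters grows (look for an "elbow");
# silhouette instead peaks at the best-separated clustering.
for k in range(2, 10):
    km = KMeans(n_clusters=k, n_init=10, random_state=42).fit(X)
    print(k, round(km.inertia_, 1), round(silhouette_score(X, km.labels_), 3))
```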
Try a walking desk to stay healthy while you study or work!
Full notes at ocdevel.com/mlg/mla-10
NLTK: Swiss Army knife. Gensim: LDA topic modeling, n-grams. spaCy: linguistics. transformers: high-level business NLP tasks.
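As a taste of the "high-level business tasks" tier, a minimal transformers pipeline; it downloads whatever default model the library currently ships for the task:

```python
from transformers import pipeline

# One-liner for a high-level task; the model download happens on first run.
classifier = pipeline("sentiment-analysis")
print(classifier("This episode made NLP finally click for me."))
# -> [{'label': 'POSITIVE', 'score': ...}]
```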
Try a walking desk to stay healthy while you study or work!
Full notes at ocdevel.com/mlg/mla-9
matplotlib, Seaborn, Bokeh, D3, Tableau, Power BI, QlikView, Excel
Try a walking desk to stay healthy while you study or work!
Full notes at ocdevel.com/mlg/mla-8
EDA + charting. DataFrame info/describe, imputing strategies. Useful charts like histograms and correlation matrices.
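A short pandas sketch of that flow; the file name and the imputed column are hypothetical:

```python
import pandas as pd

df = pd.read_csv("data.csv")   # hypothetical dataset

df.info()                      # dtypes and non-null counts per column
print(df.describe())           # summary stats for numeric columns

# A simple imputing strategy: median for a skewed numeric column
# ("age" is a hypothetical column name).
df["age"] = df["age"].fillna(df["age"].median())

# Quick charts (requires matplotlib): histograms + correlation matrix.
df.hist(figsize=(10, 8))
print(df.corr(numeric_only=True))
```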
Try a walking desk to stay healthy while you study or work!
Full notes at ocdevel.com/mlg/mla-7
Run your code + visualizations in the browser: IPython / Jupyter notebooks.
Try a walking desk to stay healthy while you study or work!
Full notes at ocdevel.com/mlg/mla-6
Salary based on location, gender, age, tech... from O'Reilly.
Try a walking desk to stay healthy while you study or work!
Full notes at ocdevel.com/mlg/mla-5
Dimensions, size, and shape of Numpy ndarrays / TensorFlow tensors, and methods for transforming those.
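A quick NumPy illustration of those shape transformations (the same ideas carry over to TensorFlow tensors):

```python
import numpy as np

x = np.arange(12)           # shape (12,), size 12, ndim 1
m = x.reshape(3, 4)         # shape (3, 4); reshape never changes size
print(m.ndim, m.shape, m.size)

col = x.reshape(-1, 1)      # -1 lets NumPy infer the dim: shape (12, 1)
batched = m[np.newaxis]     # prepend a batch axis: shape (1, 3, 4)
flat = batched.squeeze().ravel()  # drop size-1 axes, flatten back to (12,)
print(col.shape, batched.shape, flat.shape)
```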
Try a walking desk to stay healthy while you study or work!
Full notes at ocdevel.com/mlg/mla-3
Comparison of different data storage options when working with your ML models.
Try a walking desk to stay healthy while you study or work!
Full notes at ocdevel.com/mlg/mla-2
Some numerical data nitty-gritty in Python.
Try a walking desk to stay healthy while you study or work!
Full notes at ocdevel.com/mlg/mla-1
Reboot of the MLG episode, with more confident recommendations.
Try a walking desk to stay healthy while you study or work!
Introduction to reinforcement learning concepts. ocdevel.com/mlg/29 for notes and resources.
Notes and resources: ocdevel.com/mlg/28
Try a walking desk to stay healthy while you study or work!
More hyperparameters for optimizing neural networks. A focus on regularization, optimizers, feature scaling, and hyperparameter search methods.
Hyperparameter Search Techniques
Full notes and resources at ocdevel.com/mlg/27
Try a walking desk to stay healthy while you study or work!
Hyperparameters are crucial elements in the configuration of machine learning models. Unlike parameters, which are learned by the model during training, hyperparameters are set by humans before the learning process begins. They are the knobs and dials that humans can control to influence the training and performance of machine learning models.
Definition and Importance
Hyperparameters differ from parameters like theta in linear and logistic regression, which are learned weights. They are choices made by humans, such as the type of model, number of neurons in a layer, or the model architecture. These choices can have significant effects on the model's performance, making conscious and informed tuning vital.
Types of Hyperparameters
Model Selection: Choosing what model to use is itself a hyperparameter - for example, deciding between linear regression, logistic regression, naive Bayes, or neural networks.
Architecture of Neural Networks: The number of layers and neurons, and the choice of activation function, are hyperparameters. Activation functions transform linear outputs into non-linear outputs; popular choices include ReLU, tanh, and sigmoid, with ReLU being the default for most neural network layers.
Regularization and Optimization: These influence the learning process. The use of L1/L2 regularization or dropout, as well as the type of optimizer (e.g., Adam, Adagrad), are hyperparameters.
Optimization Techniques
Techniques like grid search, random search, and Bayesian optimization are used to systematically explore combinations of hyperparameters to find the best configuration for a given task. While these methods can be computationally expensive, they are necessary for achieving optimal model performance.
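A minimal grid-search sketch with scikit-learn; the dataset and parameter grid are illustrative:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

X, y = load_iris(return_X_y=True)

# Exhaustively tries every combination with 5-fold cross-validation;
# random search or Bayesian tools sample the space more cheaply as it grows.
grid = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid={"C": [0.01, 0.1, 1, 10]},
    cv=5,
)
grid.fit(X, y)
print(grid.best_params_, grid.best_score_)
```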
Challenges and Future Directions
The field strives towards simplifying the choice of hyperparameters, ideally automating them to become parameters of the model itself. Efforts like Google's AutoML aim to handle hyperparameter tuning automatically.
Understanding and optimizing hyperparameters is a cornerstone in machine learning, directly impacting the effectiveness and efficiency of a model. Progress continues to integrate these choices into model training, reducing the dependency on human intervention and trial-and-error experimentation.
Decision Tree
Try a walking desk to stay healthy while you study or work!
Full notes and resources at ocdevel.com/mlg/26
NOTE: This episode is no longer relevant, and tforce_btc_trader is no longer maintained. The current podcast project is Gnothi.
Episode Overview
The project "Bitcoin Trader" involves developing a Bitcoin trading bot using machine learning to capitalize on the hot topic of cryptocurrency and its potential profitability. The project will serve as a medium to delve into complex machine learning engineering topics, such as hyperparameter selection and reinforcement learning, over subsequent episodes.
Cryptocurrency, specifically Bitcoin, is used for its universal and decentralized nature, akin to a digital, secure, and democratic financial instrument like the US dollar. Bitcoin mining involves running complex calculations to manage the currency's existence, similar to a distributed Federal Reserve system, with transactions recorded on a secure and permanent ledger known as the blockchain.
The flexibility of cryptocurrency trading allows for machine learning applications across unsupervised, supervised, and reinforcement learning paradigms. This project will focus on using models such as LSTM recurrent neural networks and convolutional neural networks, highlighting Bitcoin's unique capacity to illustrate machine-learning design decisions like network architecture.
Trading differs from investing by focusing on profit from price fluctuations rather than a belief in long-term value increase. It involves understanding patterns in price actions to buy low and sell high. Different types of trading include day trading, which involves daily buying and selling, and swing trading, which spans longer periods.
Trading decisions rely on patterns identified in price graphs, using time series data. Data representation through candlesticks (OHLCV: open-high-low-close-volume), coupled with indicators like moving averages and RSI, provide multiple input features for machine learning models, enhancing prediction accuracy.
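A hedged pandas sketch of deriving such indicator features from close prices. The toy data and 3-bar windows are purely illustrative, and this is one common RSI formulation among several:

```python
import pandas as pd

# Hypothetical close prices; a real pipeline would pull OHLCV from an exchange.
df = pd.DataFrame({"close": [100, 102, 101, 105, 107, 106, 110, 108]})

df["ma_3"] = df["close"].rolling(3).mean()  # simple moving average

# One common RSI formulation (simple-average variant), over a short
# illustrative 3-bar window; 14 bars is the conventional period.
delta = df["close"].diff()
gain = delta.clip(lower=0).rolling(3).mean()
loss = (-delta.clip(upper=0)).rolling(3).mean()
df["rsi"] = 100 - 100 / (1 + gain / loss)
print(df)
```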
Exchanges like GDAX and Kraken serve as platforms for converting traditional currencies into cryptocurrencies. The efficient market hypothesis suggests that the value of an instrument is fairly priced based on the collective analysis of market participants. Differences in exchange prices can provide opportunities for arbitrage, further fueling trading strategies.
The project code, currently using deep reinforcement learning via TensorForce, employs convolutional neural networks over LSTM to adapt to Bitcoin trading's intricacies. The project will be available at ocdevel.com for community engagement, with future episodes tackling hyperparameter selection and deep reinforcement learning techniques.
Try a walking desk to stay healthy while you study or work!
Notes and resources at ocdevel.com/mlg/25
Filters and Feature Maps: Filters are small matrices used to detect visual features from an input image by applying them to local pixel patches, creating a 3D output called a feature map. Each filter is tasked with recognizing a specific pattern (e.g., edges, textures) in the input images.
Convolutional Layers: The filter is applied across the image to produce an output which is the feature map. A convolutional layer is composed of several feature maps, with depth corresponding to the number of filters applied.
Image Compression Techniques:
Max Pooling: Max pooling is a downsampling technique used to reduce the spatial dimensions of feature maps by taking the maximum value over a defined window, further compressing and reducing computational load (see the sketch after this list).
Predefined Architectures: There are well-established predefined architectures like LeNet, AlexNet, and ResNet, which have been fine-tuned through competitions such as the ImageNet Challenge, and can be used directly or adapted for specific tasks in computer vision.
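A minimal Keras sketch of the filter / feature-map / pooling pipeline described in this list; layer sizes are illustrative, not a recommendation:

```python
import tensorflow as tf

model = tf.keras.Sequential([
    # 32 filters, each a 3x3 weight matrix scanning the image; the output
    # is a stack of 32 feature maps (depth = number of filters).
    tf.keras.layers.Conv2D(32, 3, activation="relu", input_shape=(28, 28, 1)),
    tf.keras.layers.MaxPooling2D(2),   # downsample: max over 2x2 windows
    tf.keras.layers.Conv2D(64, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.summary()  # shows how each layer shrinks spatial dims and grows depth
```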
Try a walking desk to stay healthy while you study or work!
Notes and resources at ocdevel.com/mlg/24
Hardware
Desktop if you're stationary, as you'll get the best performance bang-for-buck and improved longevity; laptop if you're mobile.
Desktops: build your own PC, better value than pre-built. See PC Part Picker, and make sure to use an Nvidia graphics card. Generally shoot for the 2nd-best of CPUs/GPUs - e.g., the RTX 4070 currently (2024-01) has better value-to-price than the 4080+.
For laptops, see this post (updated).
OS / Software
Use Linux (I prefer Ubuntu), or Windows with WSL2 and Docker. See mla/12 for details.
Programming Tech Stack
Deep-learning frameworks: you'll use both TF & PT eventually, so don't get hung up. See mlg/9 for details.
Shallow-learning / utilities: scikit-learn, Pandas, NumPy
Cloud-hosting: AWS / GCP / Azure. mla/13 for details.
Episode Summary
The episode discusses setting up a tech stack tailored for machine learning, emphasizing the necessity of choosing a primary programming language and framework - in this case, Python and TensorFlow. The decision is supported by the ongoing popularity and community support for these tools, and further influenced by the need for GPU optimization, which TensorFlow provides by utilizing Nvidia's CUDA technology.
A notable change in the landscape is the decline of certain deep learning frameworks such as Theano, and the rise of competitors like PyTorch, which is gaining traction due to its ease of use in comparison to TensorFlow. The author emphasizes the importance of selecting frameworks with robust community support and resources, highlighting TensorFlow's lead in the market in this respect.
For hardware, the suggestion is a custom-built PC with a powerful Nvidia GPU, such as the 1080 Ti, running Ubuntu Linux for best compatibility. However, for those who favor cloud services, Amazon Web Services (AWS) and Google Cloud Platform (GCP) are viable options, with a preference for GCP due to cost and performance benefits, particularly with the upcoming Tensor Processing Units (TPUs).
On the software side, the use of Pandas for data manipulation, NumPy for mathematical operations, and Scikit-Learn for shallow learning tasks provides a comprehensive toolkit for machine learning development. Additionally, the use of abstraction libraries such as Keras for simplifying TensorFlow syntax and TensorForce for reinforcement learning are recommended.
The episode further explores system architectures, suggesting a separation of concerns between a web app server and a machine learning (job) server. Communication between these components can be efficiently managed using a message queuing system like RabbitMQ, with Celery as a potential abstraction layer.
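A minimal Celery sketch of that split; the broker URL, task name, and payload are all hypothetical:

```python
# tasks.py - Celery abstracts the RabbitMQ queue between the web app server
# and the ML job server.
from celery import Celery

app = Celery("ml_jobs", broker="amqp://localhost")

@app.task
def run_inference(model_id: str, payload: dict) -> dict:
    # Runs on the job server: load the model, predict (stubbed here).
    return {"model_id": model_id, "prediction": 0.42}

# On the web server, enqueue without blocking the request cycle:
# result = run_inference.delay("sentiment-v1", {"text": "hello"})
```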
To support developers in implementing their machine learning pipelines, the recommendation extends to leveraging existing datasets, using Scikit-Learn for convenient access, and standardizing data for effective training results. The author points to several books and resources for understanding and applying these technologies, ending with workstation recommendations and a note that building TensorFlow from source, for performance gains, is a potential advanced optimization step.
Try a walking desk to stay healthy while you study or work!
Notes and resources at ocdevel.com/mlg/23
Neural Network Types in NLP
Vanilla Neural Networks (Feedforward Networks):
Convolutional Neural Networks (CNNs):
Recurrent Neural Networks (RNNs):
Supervised vs Reinforcement Learning:
Encoder-Decoder Models:
Gradient Problems & Solutions:
Try a walking desk to stay healthy while you study or work!
Notes and resources at ocdevel.com/mlg/22
Deep NLP Fundamentals
Deep learning has had a profound impact on natural language processing by introducing models like recurrent neural networks (RNNs) that are specifically adept at handling sequential data. Unlike traditional linear models like linear regression, RNNs can address the complexities of language, which arise from its inherent non-linearity and hierarchy. These models learn complex features by combining data across multiple layers, which has revolutionized areas like sentiment analysis, machine translation, and more.
Neural Networks and Their Use in NLP
Neural networks can be categorized into regular feedforward neural networks and recurrent neural networks (RNNs). Feedforward networks are used for non-sequential tasks, while RNNs are useful for sequential data processing such as language, where the network’s hidden layers are connected to enable learning over time steps. This loopy architecture allows RNNs to maintain a form of state or memory, making them effective for tasks where context is crucial. The challenge of mapping these sequences into meaningful output has led to architectures like the encoder-decoder model, which reads entire sequences to produce responses or translations, enhancing the network's ability to learn and remember context across long sequences.
Word Embeddings and Contextual Representations
A key challenge in processing natural language using machine learning models is representing words as numbers, as machine learning relies on mathematical operations. Initial representations like one-hot vectors were simple but lacked semantic meaning. To address this, word embeddings such as those generated by the Word2Vec model have been developed. These embeddings place words in a vector space where distance and direction between vectors are meaningful, allowing models to interpret semantic similarities and differences between words. Word2Vec, using neural networks, learns these embeddings by predicting word contexts or vice versa.
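A tiny Gensim sketch of training Word2Vec embeddings. The corpus is far too small to learn anything meaningful; it only shows the API shape:

```python
from gensim.models import Word2Vec

sentences = [["machine", "learning", "is", "fun"],
             ["deep", "learning", "extends", "machine", "learning"]]

# Train skip-gram (sg=1) embeddings; vector_size is the embedding dimension.
model = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=1)

print(model.wv["learning"].shape)               # a 50-d vector for one word
print(model.wv.most_similar("machine", topn=2)) # neighbors in the vector space
```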
Advanced Architectures and Practical Implications
RNNs and their more sophisticated versions like LSTM and GRU cells address specific challenges such as the vanishing gradient problem, which can occur during backpropagation through time. These architectures allow for more effective and longer-range dependencies to be learned, vital for handling the nuances of human language. As a result, these models have become dominant in modern NLP, replacing older methods for tasks ranging from part-of-speech tagging to machine translation.
Further Learning and Resources
For in-depth learning, resources such as the "Unreasonable Effectiveness of RNNs", Stanford courses on deep NLP by Christopher Manning, and continued education in deep learning can enhance one's understanding of these models. Emphasis on both theoretical understanding and practical application will be crucial for mastering the deep learning techniques that are transforming NLP.
Try a walking desk to stay healthy while you study or work!
Notes and resources at ocdevel.com/mlg/20
NLP progresses through three main layers: text preprocessing, syntax tools, and high-level goals, each building upon the last to achieve complex linguistic tasks.
Text Preprocessing
Text preprocessing involves essential steps such as tokenization, stemming, and stop word removal. These foundational tasks clean and prepare text for further analysis, ensuring that subsequent processes can be applied more effectively.
Syntax Tools
Syntax tools are crucial for understanding grammatical structures within text. Part of Speech Tagging identifies the role of words within sentences, such as noun, verb, or adjective. Named Entity Recognition (NER) distinguishes entities such as people, organizations, and dates, leveraging models like maximum entropy, support vector machines, or hidden Markov models.
Achieving High-Level Goals
High-level NLP goals include text classification, sentiment analysis, and optimizing search engines. Techniques such as the Naive Bayes algorithm enable effective text classification by simplifying documents into word occurrence models. Search engines benefit from the TF-IDF method in tandem with cosine similarity, allowing for efficient document retrieval and relevance ranking.
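A short scikit-learn sketch of TF-IDF retrieval with cosine similarity; the documents and query are toy examples:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

docs = ["the cat sat on the mat",
        "dogs and cats are pets",
        "stock prices rose sharply today"]

tfidf = TfidfVectorizer(stop_words="english")
doc_vecs = tfidf.fit_transform(docs)

# Rank documents against a query by cosine similarity of TF-IDF vectors;
# the highest score is the most relevant document.
query_vec = tfidf.transform(["cat pets"])
print(cosine_similarity(query_vec, doc_vecs))
```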
In-depth Look at Syntax Parsing
Syntax parsing delves into sentence structure through two primary approaches: context-free grammars (CFG) and dependency parsing. CFGs use production rules to break down sentences into components like noun phrases and verb phrases. Probabilistic enhancements to CFGs learn from datasets like the Penn Treebank to determine the likelihood of various grammatical structures. Dependency parsing, on the other hand, maps out word relationships through directional arcs, providing a visual dependency tree that highlights connections between components such as subjects and verbs.
Applications of NLP Tools
Syntax parsing plays a vital role in tasks like relationship extraction, providing insights into how entities relate within text. Question answering integrates various tools, using TF-IDF and syntax parsing to locate and extract precise answers from relevant documents, evidenced in systems like Google's snippet answers.
Text summarization seeks to distill large texts into concise summaries. By employing TF-IDF, the process identifies sentences rich in informational content due to their less frequent vocabulary, removing redundancies for a coherent summary. TextRank, a graph-based methodology, evaluates sentence importance based on their connectedness within a document.
Machine Translation Evolution
Machine translation demonstrates the transformative impact of deep learning. Traditional methods, characterized by their complexity and multiple models, have been surpassed by neural machine translation systems. These employ recurrent neural networks (RNNs) to achieve end-to-end translation, accommodating tasks traditionally dependent on separate linguistic models into a unified approach, thus simplifying development and improving accuracy.
The episode underscores the transition from shallow NLP approaches to deep learning methods, highlighting how advanced models, particularly those involving RNNs, are redefining speech processing tasks with efficiency and sophistication.
Try a walking desk to stay healthy while you study or work!
Notes and resources at ocdevel.com/mlg/19
Classical NLP Techniques:
Origins and Phases in NLP History: NLP initially relied on hardcoded linguistic rules; its evolution pivoted significantly with the introduction of machine learning - particularly shallow learning algorithms - leading eventually to deep learning, the current standard.
Importance of Classical Methods: Knowing traditional methods is still valuable, providing a historical context and foundation for understanding NLP tasks. Traditional methods can be advantageous with small datasets or limited compute power.
Edit Distance and Stemming:
Language Models:
Naive Bayes for Classification:
Part of Speech Tagging and Named Entity Recognition:
Generative vs. Discriminative Models:
Topic Modeling with LDA:
Search and Similarity Measures:
Try a walking desk to stay healthy while you study or work!
Full notes at ocdevel.com/mlg/18
Overview: Natural Language Processing (NLP) is a subfield of machine learning that focuses on enabling computers to understand, interpret, and generate human language. It is a complex field that combines linguistics, computer science, and AI to process and analyze large amounts of natural language data.
NLP Structure
NLP is divided into three main tiers: parts, tasks, and goals.
1. Parts
Text Pre-processing:
Syntactic Analysis:
High-Level Applications:
Evolution:
Key Algorithms:
NLP offers robust career prospects as companies strive to implement technologies like chatbots, virtual assistants (e.g., Siri, Google Assistant), and personalized search experiences. It's integral to market leaders like Google, which relies on NLP for applications from search result ranking to understanding spoken queries.
Resources for Learning NLP
Books:
Online Courses:
Tools and Libraries:
NLP continues to evolve with applications expanding across AI, requiring collaboration with fields like speech processing and image recognition for tasks like OCR and contextual text understanding.
Try a walking desk to stay healthy while you study or work!
At this point, browse #importance:essential on ocdevel.com/mlg/resources, following the 45m/day ML and 15m/day math breakdown.
Try a walking desk to stay healthy while you study or work!
Full notes at ocdevel.com/mlg/16
Inspiration in AI Development
Early inspirations for AI development centered around solving challenging problems, but recent advancements like self-driving cars and automated scientific discoveries attract professionals due to potential economic automation and career opportunities.
The Singularity
The singularity suggests exponential technological growth leading to a point where AI and robotics automate all technology development, potentially achieving 'seed AI' capable of self-improvement and escaping human intervention.
Defining Consciousness
Consciousness distinguishes intelligence by awareness. Perception, self-identity, learning, memory, and awareness might all contribute to consciousness, but awareness or subjective experience (qualia) is viewed as a core component.
Hard vs. Soft Problems of Consciousness
The soft problems are those we can study through the sciences - like brain regions being associated with specific functions. The hard problem, however, is explaining how subjective experience arises from physical processes in the brain.
Theories and Debates
Opinions vary widely on whether AI can achieve consciousness, depending on theories of biological plausibility and arguments like John Searle's Chinese Room. The matter of consciousness remains deeply philosophical, touching on human identity itself. The expansion of machine learning and AI might be humanity's next evolutionary step, potentially culminating in the creation of conscious entities.
Try a walking desk to stay healthy while you study or work!
Full notes at ocdevel.com/mlg/15
Concepts
Try a walking desk to stay healthy while you study or work!
Full notes at ocdevel.com/mlg/14
Anomaly Detection Systems
Try a walking desk to stay healthy while you study or work!
Full notes at ocdevel.com/mlg/13
Support Vector Machines (SVM)
Try a walking desk to stay healthy while you study or work!
Full notes at ocdevel.com/mlg/12
Topics
Shallow vs. Deep Learning: Shallow learning can often solve problems more efficiently in time and resources compared to deep learning.
Supervised Learning: Key algorithms include linear regression, logistic regression, neural networks, and K Nearest Neighbors (KNN). KNN is unique as it is instance-based and simple, categorizing new data based on proximity to known data points (see the sketch after this list).
Unsupervised Learning:
Decision Trees: Utilized for both classification and regression, decision trees offer a visible, understandable model structure. Variants like Random Forests and Gradient Boosting Trees increase performance and reduce overfitting risks.
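A minimal scikit-learn sketch of the instance-based KNN idea from the list above (toy 2-D data):

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Toy 2-D points with binary labels; swap in real features.
X = np.array([[0, 0], [1, 1], [1, 0], [8, 8], [9, 9], [8, 9]])
y = np.array([0, 0, 0, 1, 1, 1])

# KNN stores the training instances and classifies new points by the
# majority label among the k nearest stored points.
knn = KNeighborsClassifier(n_neighbors=3).fit(X, y)
print(knn.predict([[0.5, 0.5], [8.5, 8.5]]))  # -> [0 1]
```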
Try a walking desk to stay healthy while you study or work!
Full notes at ocdevel.com/mlg/10
Topics:
Recommended Languages and Frameworks:
Language Choices:
Framework Details:
Try a walking desk to stay healthy while you study or work!
Full notes at ocdevel.com/mlg/9
Key Concepts:
Unique Features of Neural Networks:
Applications:
Computational Considerations:
Architectures & Optimization:
Try a walking desk to stay healthy while you study or work!
Full notes at ocdevel.com/mlg/8
Mathematics in Machine Learning
Try a walking desk to stay healthy while you study or work!
Full notes at ocdevel.com/mlg/7. See Andrew Ng Week 3 Lecture Notes
Overview
Try a walking desk to stay healthy while you study or work!
Full notes at ocdevel.com/mlg/6
Pursuing Machine Learning:
Try a walking desk to stay healthy while you study or work!
Show notes at ocdevel.com/mlg/5. See Andrew Ng Week 2 Lecture Notes
Key Concepts
Access to Andrew Ng's Course on Coursera is encouraged to gain in-depth understanding and application skills in machine learning.
Try a walking desk to stay healthy while you study or work!
Show notes at ocdevel.com/mlg/4
The AI Hierarchy
Try a walking desk to stay healthy while you study or work!
Show notes at ocdevel.com/mlg/3.
This episode covers four major philosophical topics related to artificial intelligence. The purpose is to give broader context to why AI matters, before moving into technical details in later episodes.
1. Economic Automation
AI is automating not just simple tasks like data entry or tax prep, but also high-skill jobs such as medical diagnostics, surgery, and creative work like design, music, and art. There are two common reactions:
2. The Singularity
The singularity refers to a point of runaway technological growth, where AI becomes capable of improving itself recursively. This concept is tied to "artificial general intelligence" and "seed AI" - systems that not only perform tasks but create better versions of themselves. The idea is that this could trigger extremely rapid change, possibly representing a new phase of evolution beyond humanity.
3. Consciousness
I explore whether consciousness can emerge from machines. Since the brain is a physical machine and consciousness arises from it, it's possible that artificial systems could develop similar properties. Related ideas:
4. AI Safety
I discuss scenarios where AI causes harm not through malevolence but through poorly defined objectives. One example is the "paperclip maximizer" thought experiment, where an AI tasked with maximizing paperclip production might consume all resources to do so. This has led some public figures to raise concerns about AI safety. I don't share the same level of concern, but the topic is worth being aware of.
In the next episode, I begin covering the technical foundations of machine learning, starting with supervised, unsupervised, and reinforcement learning.
Links:
What is artificial intelligence, machine learning, and data science? What are their differences? AI history.
Hierarchical breakdown: DS(AI(ML)). Data science: any profession dealing with data (including AI & ML). Artificial intelligence is simulated intellectual tasks. Machine Learning is algorithms trained on data to learn patterns to make predictions.
Artificial Intelligence (AI) - Wikipedia
Oxford Languages: the theory and development of computer systems able to perform tasks that normally require human intelligence, such as visual perception, speech recognition, decision-making, and translation between languages.
AlphaGo Movie, very good!
Applications
Machine Learning (ML) - Wikipedia
Oxford Languages: the use and development of computer systems that are able to learn and adapt without following explicit instructions, by using algorithms and statistical models to analyze and draw inferences from patterns in data.
Data Science (DS) - Wikipedia
Wikipedia: Data science is an interdisciplinary field that uses scientific methods, processes, algorithms and systems to extract knowledge and insights from noisy, structured and unstructured data, and apply knowledge and actionable insights from data across a broad range of application domains. Data science is related to data mining, machine learning and big data.
History
1700s-1800s: Statistics & mathematical decision making
1936: Universal Turing Machine
50s-70s: "AI" coined at the 1956 Dartmouth workshop, with the goal of simulating all aspects of intelligence. John McCarthy, Marvin Minsky, Arthur Samuel, Oliver Selfridge, Ray Solomonoff, Allen Newell, Herbert Simon
90s: Data, computation, and practical application bring AI back
Show notes: ocdevel.com/mlg/1. MLG teaches the fundamentals of machine learning and artificial intelligence. It covers intuition, models, math, languages, frameworks, etc. Where your other ML resources provide the trees, I provide the forest. Consider MLG your syllabus, with highly-curated resources for each episode's details at ocdevel.com. Audio is a great supplement during exercise, commute, chores, etc.
Who is it for
Why audio?
What it's not
Planned episodes