Podd: PyTorch Developer Podcast

Compiler collectives

4 augusti 2024 | 17 min

TORCH_TRACE and tlparse

29 april 2024 | 15 min

Higher order operators

21 april 2024 | 17 min

Inductor - Post-grad FX passes

12 april 2024 | 24 min

CUDA graph trees

24 mars 2024 | 21 min

Min-cut partitioner

17 mars 2024 | 16 min

AOTInductor

2 mars 2024 | 18 min

Tensor subclasses and PT2

24 februari 2024 | 13 min

Compiled autograd

19 februari 2024 | 18 min

PT2 extension points

5 februari 2024 | 16 min

Inductor - Define-by-run IR

24 januari 2024 | 12 min

Unsigned integers

17 januari 2024 | 13 min

Inductor - IR

16 januari 2024 | 18 min

Dynamo - VariableTracker

12 januari 2024 | 16 min

Unbacked SymInts

21 februari 2023 | 22 min

Zero-one specialization

20 februari 2023 | 21 min

torchdynamo

6 december 2022 | 26 min

PyTorch 2.0

4 december 2022 | 18 min

History of functorch

7 november 2022 | 19 min

Learning rate schedulers

13 juni 2022 | 20 min

Weak references

6 juni 2022 | 17 min

Strides

30 maj 2022 | 21 min

AOTAutograd

9 maj 2022 | 19 min

Dispatcher questions with Sherlock

2 maj 2022 | 19 min

New CI

25 april 2022 | 16 min

Python exceptions

17 april 2022 | 15 min

Torch vs ATen APIs

11 april 2022 | 15 min

All about NVIDIA GPUs

24 september 2021 | 19 min

Tensor subclasses and Liskov substitution principle

16 september 2021 | 19 min

Half precision

10 september 2021 | 18 min

DataLoader with multiple workers leaks memory

1 september 2021 | 17 min

Batching

18 augusti 2021 | 14 min

Multiple dispatch in __torch_function__

10 augusti 2021 | 14 min

Multithreading

3 augusti 2021 | 19 min

Asynchronous versus synchronous execution

27 juli 2021 | 15 min

gradcheck

23 juli 2021 | 17 min

torch.use_deterministic_algorithms

21 juli 2021 | 11 min

Reference counting

20 juli 2021 | 15 min

Memory layout

13 juli 2021 | 16 min

pytorch-probot

12 juli 2021 | 13 min

API design via lexical and dynamic scoping

9 juli 2021 | 22 min

Intro to distributed

8 juli 2021 | 16 min

Double backwards

7 juli 2021 | 17 min

Functional modules

6 juli 2021 | 15 min

CUDA graphs

28 juni 2021 | 14 min

Default arguments

25 juni 2021 | 15 min

Anatomy of a domain library

24 juni 2021 | 16 min

TensorAccessor

23 juni 2021 | 12 min

Random number generators

22 juni 2021 | 14 min

vmap

21 juni 2021 | 18 min

Expect tests

18 juni 2021 | 13 min

XLA

17 juni 2021 | 16 min

TH

16 juni 2021 | 11 min

TorchScript

15 juni 2021 | 20 min

CMake

14 juni 2021 | 18 min

torchdeploy

11 juni 2021 | 14 min

C++ frontend

10 juni 2021 | 17 min

PyObject preservation

9 juni 2021 | 16 min

Mobile selective build

8 juni 2021 | 16 min

What is mobile selective build? Why are we so obsessed with reducing binary size? How does selective build work? Why doesn't static linking just work? Why can't you just read out the ops used in a TorchScript model to determine what operators you actually need? What are the tradeoffs of statically determining the operator dependency graph versus tracing? What's up with the SELECTIVE_NAME macro? How the heck does selective build work at all when you have multiple mobile apps in a single Buck build system? What takeaways should I have as a regular PyTorch developer?

Further reading:

Official open source mobile documentation on custom selective builds https://pytorch.org/mobile/android/#custom-build
How to rebuild the op dependency yaml https://github.com/pytorch/pytorch/blob/master/tools/code_analyzer/build.sh

Liner notes:

binary size is premium; ship only what you actually need
big idea:
- get the ops your model needs -> apply this to build of pytorch
get the ops your model needs
- TorchScript ~> read it out directly from the model itself
- but what if ops use other ops?
  - need a dependency graph. done with static analysis llvm (jiakai) ~> with a (possibly inaccurate) yaml checked in for easy kickstart if you don't want to run the pass (updated by bot, not operational since Feb, recommend rebuilding from scratch if you run into trouble)
- other possibility: dynamic tracing
  - pro: no need for dependency graph, just look at what was called; works for dtypes
  - con: need representative inputs, if control flow might not cover everything
apply this to build of pytorch
- ordinarily: static linking ensures stuff that isn't used gets pruned
  - but this doesn't work with distributed operator registration based on static initializers
- how?
  - codegen - just don't generate it
  - no codegen - SELECTIVE_NAME - C++ doesn't support string in macro
- build system integration
  - buck constraint: only one library
    - therefore: generate multiple copies of glue library
  - alt: atomize library into each operator. caffe2 used to do this; each library takes a long time to build (1m) and crashes xcode because there's too many
common hiccups
- modify implementation details, some op is/isn't called anymore ~> error! usually just means some yaml needs regenerating. PyTorch Edge developers are very friendly and can help

torch.nn

7 juni 2021 | 14 min

Code generation

4 juni 2021 | 17 min

Why is autograd so complicated

3 juni 2021 | 16 min

__torch_function__

2 juni 2021 | 17 min

TensorIterator

1 juni 2021 | 18 min

native_functions.yaml

28 maj 2021 | 16 min

Serialization

27 maj 2021 | 17 min

Continuous integration

26 maj 2021 | 17 min

Stacked diffs and ghstack

25 maj 2021 | 12 min

Shared memory

24 maj 2021 | 11 min

Automatic mixed precision

21 maj 2021 | 14 min

Conjugate views

20 maj 2021 | 15 min

History and constraints of Tensor

19 maj 2021 | 15 min

How new operators are authored

18 maj 2021 | 16 min

The life and death of Variable

17 maj 2021 | 15 min

Backend extensibility

14 maj 2021 | 15 min

The road to structured kernels

13 maj 2021 | 17 min

Functionalization

12 maj 2021 | 14 min

Just enough CUDA to be dangerous

11 maj 2021 | 17 min

Inference mode

10 maj 2021 | 14 min

Vectorization

7 maj 2021 | 15 min

Dynamic library structure

6 maj 2021 | 15 min

History and constraints of the dispatcher

5 maj 2021 | 18 min

Binding C++ objects to Python

4 maj 2021 | 13 min

PyTorch Developer Podcast

The PyTorch Developer Podcast is a place for the PyTorch dev team to do bite sized (10-20 min) topics about all sorts of internal development topics in PyTorch.

Om podden

Avsnitt