Jordan Hendricks joined Bryan and Adam to talk about her work virtualizing time--particularly challenging when migrating virtual machines from one physical machine to another!
We've been hosting a live show weekly on Mondays at 5p for about an hour, and recording them all; here is the recording from June 12th, 2023.
In addition to Bryan Cantrill and Adam Leventhal, we were joined by Oxide colleague Jordan Hendricks.
The (lightly edited) live chat from the show:
- DanCrossNYC: The TSC ticks at a fixed rate now days, regardless of voltage scaling on the CPU.
- jbk: just x86 doesn't provide a consistent want to determine what the rate is
- jbk: (I guess some chips will tell you via CPUID, but I've yet to actually encounter such chips)
- jbk: some hypervisors will tell you via an MSR
- zorg24: Looks the Linux kernel docs have some documentation on the x86 TSC and PIT https://www.kernel.org/doc/html/next/virt/kvm/x86/timekeeping.html
- DanCrossNYC: CPUID or an MSR, but yeah, most systems sample over a fixed interval (determined by another time source) to figure it out.
- jbk: no, versus some other present component that allows you to measure the frequency
- DanCrossNYC: No, the PIT or HPET or something.
- jbk: https://src.illumos.org/source/xref/illumos-gate/usr/src/uts/i86pc/os/tscc_pit.c?r=236cb9a8
- jbk: is how it uses the PIT
- jbk: (the HPET code needs to improve it's accuracy, so it's only used when the PIT isn't there at the moment)
- jbk: some Intel NUCs have no PIT
- jbk: so HPET is the only option
- bcantrill: https://github.com/illumos/illumos-gate/commit/717646f7112314de3f464bc0b75f034f009c861e
- DanCrossNYC: Two big ones: system maintenance without disturbing guest workloads, and also load balancing across a rack.
- "Sevan: ah, thanks.
- https://github.com/illumos/illumos-gate/blob/717646f7112314de3f464bc0b75f034f009c861e/usr/src/test/bhyve-tests/tests/common/common.c#L166"
- bcantrill: https://github.com/oxidecomputer/tsc-simulator/tree/master
- DanCrossNYC: The guest may well be running NTP itself.
- iangrunert: I assume you could also check that NTP is alive / has synced recently before doing a migration right?
- aka_pugs: Do people use IEEE 1588/PTP in datacenters? Maybe finance wackos?
- zorg24: also it might be tricky to check if NTP synced recently if it is happening in usermode
- iangrunert: Might've missed this - is it just the hypervisor that has to run NTP recently or the VM as well?
- saone: I believe it was just the hypervisor
- DanCrossNYC: The host.
- DanCrossNYC: A guest may or may not; that's up to the guest.
- jbk: but IIUC, if the guest IS running NTP, then the host definitely needs it to avoid any time warps
- DanCrossNYC: Yup.
- DanCrossNYC: Fortunately, there's a bit of an out for the blackout window during migration: SMM mode can effectively pause a machine for an indefinite period of time.
- DanCrossNYC: We don't USE SMM anywhere, but robust systems software kinda needs to handle the case where the machine goes out to lunch for a minute.
- zorg24: 🙌 hooray for hardware with no SMM use
- DanCrossNYC: We have done everything we can to turn it off.
- ahl: https://github.com/dtolnay/case-studies/blob/master/autoref-specialization/README.md
- ahl: https://github.com/oxidecomputer/propolis
- earltea: it worked so well I almost thought the VM didn't migrate 😅
- saone: It's easy to forget that there's a world outside the cloud, but edge deployments that have physical peripherals hooked up need to maintain those connections to peripherals; migrating those peripherals to cloud environments and managing that integration has been a big challenge for my group.
- iangrunert: https://signalsandthreads.com/clock-synchronization/ Good listen about clock synchronization and PTP in the ""finance weirdos"" world. MiFID 2 time sync requirements require timestamping key trading event records to within 100 microseconds of UTC.
- jhendricks: a bit belated, but the propolis side of these changes: https://github.com/oxidecomputer/propolis/commit/7ed480843d3b5cfd9fd07dce41772f8eac4e9171
- saethlin: The calvalry??
- saethlin: Are we just going to let that slide
- saethlin: Is this a pronunciation situation again
- zorg24: not the first time I've heard it pronounced that way 🤷
- saethlin: Well maybe it's me learning this time
- DanCrossNYC: Calvary
- DanCrossNYC: That's the religious thing.
- ahl: https://github.com/illumos/illumos-gate/blob/0c5967db436935325af441af2b27d337f4e64af5/usr/src/uts/common/os/cyclic.c#L44
- zooooooooo: thought this was rust typescript at first 😳
- DanCrossNYC: Dunno... I missed it. 🙂
- ahl:
* Starting in about 1994, chip architectures began specifying high resolution
* timestamp registers. As of this writing (1999), all major chip families
* (UltraSPARC, PentiumPro, MIPS, PowerPC, Alpha) have high resolution
* timestamp registers, and two (UltraSPARC and MIPS) have added the capacity
* to interrupt based on timestamp values. These timestamp-compare registers
* present a time-based interrupt source which can be reprogrammed arbitrarily
* often without introducing error. Given the low cost of implementing such a
* timestamp-compare register (and the tangible benefit of eliminating
* discrete timer parts), it is reasonable to expect that future chip
* architectures will adopt this feature.
- aka_pugs: Bryan's TSC is overflowing.
- DanCrossNYC: That's Tom.
- DanCrossNYC: Riding in with the cavalry.
- aka_pugs: Good session.
- ahl: Thanks...