Oxide and Friends Twitter Space: May 10, 2021
A Requiem for SPARC with Tom Lyon
We’ve been holding a Twitter Space weekly on Mondays at 5p for about an hour. In addition to [@bcantrill](https://twitter.com/bcantrill) and [@ahl](https://twitter.com/ahl), speakers included special guest Tom Lyon plus Joshua Clulow, Dan McDonald, Dan Cross, Tom Killalea, Theo Schlossnagle, Antranig Vartanian, and [@perlhack](https://twitter.com/perlhack).
We recorded the space; the recording is here.
Some of the topics we hit on, in the order that we hit them:
- [@2:06](https://youtu.be/79NNXn5Kr90?t=126) SPARC 30th anniversary dinner > SPARC was an amazing achievement for its time, > but there were some nasty trade-offs made.
- [@2:56](https://youtu.be/79NNXn5Kr90?t=176) illumos announcement on the end of SPARC support
- [@4:37](https://youtu.be/79NNXn5Kr90?t=277) “There is no photography allowed in the bring-up lab” story
- [@6:23](https://youtu.be/79NNXn5Kr90?t=383) UltraSPARC-II E-cache parity error
- [@8:51](https://youtu.be/79NNXn5Kr90?t=531) Register windows > Most people don’t know, about that first SPARC, > there was no integer multiply or divide..
> It would trap on the instructions. - I feel so decadent, I’ve just been sprinkling multiplications around my code for years.
- [@9:55](https://youtu.be/79NNXn5Kr90?t=595) popc instruction (also called Hamming Weight)
- IBM Stretch 1961, and the one-of-a-kind IBM Harvest made for the NSA
- Henry Warren’s 2002 Hacker’s Delight Ch. 5 shows a ~20 instruction algorithm (no branches, only adds/shifts/masks by constants) > Warren: According to computer folklore, the population count function is important to the > National Security Agency. No one (outside of NSA) seems to know just what they use it for, > but it may be in cryptography work or in searching huge amounts of material.
- According to Agner Fog, Ice Lake performs popcnt with a 3 cycle latency, and Zen 3 with just 1 cycle latency.
- Phil Bagwell’s 2001 Ideal Hash Trees depend on pop count > Bagwell: Note that the performance of the algorithm is seriously impacted > by the poor execution speed of the POPCT emulation in Java, a problem > the Java designers may wish to address.
- Persistent versions of Bagwell’s trees are used for the built-in hash maps of Clojure, and in libraries for Scala etc.
- [@11:39](https://youtu.be/79NNXn5Kr90?t=699) This was the debate between Roger Faulkner and Jeff Bonwick: register windows
- [@12:35](https://youtu.be/79NNXn5Kr90?t=755) Register fishing: Bryan’s version and Adam’s version > When you want to know the state of some other process, you have to flush > those register windows to memory to be able to recover the stack trace.
- [@14:30](https://youtu.be/79NNXn5Kr90?t=870) Delay slot > We sat around the lunch table talking about how crazy it would > be to have a branch that executed right after a branch.
- DCTI couple (delayed control transfer instruction)
- [@15:31](https://youtu.be/79NNXn5Kr90?t=931) “Well, the instruction set doesn’t allow that..” story > Bedlam. As far as Solaris kernel discussions go, bedlam.
- Leibniz vs. Newton
- [@20:14](https://youtu.be/79NNXn5Kr90?t=1214) Annulled branches
- [@22:17](https://youtu.be/79NNXn5Kr90?t=1337) Praise for SPARC
- SPARC address space identifiers > When we were porting Solaris to x86, and deciding what fraction of the > address space would belong to the kernel vs the user, it felt disgusting to me.
- [@25:26](https://youtu.be/79NNXn5Kr90?t=1526) Software-filled TLB > They just didn’t have the room to cram a hardware page table walk into the chip.
- MIPS would give you a trap on a VAC conflict (virtual address cache)
- [@27:34](https://youtu.be/79NNXn5Kr90?t=1654) It was slow, it was late, and it had a lot of problems, it was wrong.
- UltraSPARC-III, code-named “Cheetah” > It’s weird, I compile this thing over and over, and every 80th time when > I compile and run it, it’s 40x slower..
- UltraSPARC-IV+, code-named “Panther”
- [@32:17](https://youtu.be/79NNXn5Kr90?t=1937) Does the Viking I-cache bug ring a bell?
- SuperSPARC, code-named “Viking” > You’d have to DC balance the I-cache. If you had too many zeros, > they’d start flipping to ones.
- E-cache parity error > It was due to everything but high energy particle strikes.
- Radioactive boron in our SRAM manufacturing process
- [@38:52](https://youtu.be/79NNXn5Kr90?t=2332) “Move it further from the tube” story > When you’re going to have a customer do something, you have to remember there’s > a human being on the other end of that. You cannot have them chasing your theories. > You need to be transparent and honest with them.
- [@42:25](https://youtu.be/79NNXn5Kr90?t=2545) Micron DRAM story
- [@44:38](https://youtu.be/79NNXn5Kr90?t=2678) High priced consultants and cosmic rays > They literally lined the roof with lead.. and it didn’t change the error rat...