> Why the duplication? I have not yet observed Metal using different programs for each.

I'm guessing whoever designed the system wasn't sure whether they would ever need to be different, and designed it so that they could be. It turned out that they didn't need to be, but it was either more work than it was worth to change it (considering that simply passing the same parameter twice is trivial), or they wanted to leave the flexibility in the system in case it's needed in future.

I've definitely had APIs like this in a few places in my code before.

> Yes, AGX is a mobile GPU, designed for the iPhone. The M1 is a screaming fast desktop, but its unified memory and tiler GPU have roots in mobile phones.

PowerVR has its roots in a desktop video card with somewhat limited release and impact. It really took off when it was used in the Sega Dreamcast home console and the Sega Naomi arcade board. It was only later that people put them in phones.

I really appreciate the writing and work that was done here.

It is amazing to me how complicated these systems have become. I am looking over the source for the single triangle demo. Most of this is just about getting information from point A to point B in memory. Over 500 lines' worth of GPU protocol overhead... Granted, this is a one-time cost once you get it working, but it's still a lot to think about and manage over time.

I've written software rasterizers that fit neatly within 200 lines and provide very flexible pixel shading techniques. Certainly not capable of running a Cyberpunk 2077 scene, but interactive framerates otherwise. In the good case, I can go from a dead stop to final frame buffer in <5 milliseconds. Can you even get the GPU to wake up in that amount of time?
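
For a sense of scale, here is a minimal sketch of what the core of such a software rasterizer might look like. It is not the commenter's code: the names `edge`, `rasterize`, and `shade` are hypothetical, and it assumes a single 2D triangle, pixel-center sampling, and no depth test or clipping beyond clamping to the framebuffer.

```python
import math


def edge(ax, ay, bx, by, px, py):
    """Signed area of the parallelogram spanned by (b - a) and (p - a)."""
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax)


def rasterize(tri, width, height, shade):
    """Fill one 2D triangle into a width x height framebuffer (list of rows).

    tri is three (x, y) vertices. shade(w0, w1, w2) is a hypothetical
    "pixel shader" callback that receives barycentric weights.
    """
    (x0, y0), (x1, y1), (x2, y2) = tri
    fb = [[0] * width for _ in range(height)]

    area = edge(x0, y0, x1, y1, x2, y2)
    if area == 0:
        return fb  # degenerate triangle covers no pixels

    # Only visit pixels inside the triangle's bounding box, clamped to the
    # framebuffer.
    min_x = max(math.floor(min(x0, x1, x2)), 0)
    max_x = min(math.floor(max(x0, x1, x2)), width - 1)
    min_y = max(math.floor(min(y0, y1, y2)), 0)
    max_y = min(math.floor(max(y0, y1, y2)), height - 1)

    for y in range(min_y, max_y + 1):
        for x in range(min_x, max_x + 1):
            px, py = x + 0.5, y + 0.5  # sample at the pixel center
            # Dividing by the signed area makes the test work for either
            # winding order and yields barycentric weights summing to 1.
            w0 = edge(x1, y1, x2, y2, px, py) / area
            w1 = edge(x2, y2, x0, y0, px, py) / area
            w2 = edge(x0, y0, x1, y1, px, py) / area
            if w0 >= 0 and w1 >= 0 and w2 >= 0:
                fb[y][x] = shade(w0, w1, w2)
    return fb
```

Calling `rasterize([(0, 0), (4, 0), (0, 4)], 4, 4, lambda w0, w1, w2: 1)` fills the lower-left half of a 4x4 buffer. The flexibility the comment mentions comes from `shade` being an ordinary function: anything computable from the weights can be a "shader".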

Few things are more enjoyable than reading a good bug story, even when it's not one's area of expertise. Well done.
Huh, I always thought tilers re-ran their vertex shaders multiple times -- once position-only to do binning, and then again to compute all attributes for each tile; that's what the "forward tilers" like Adreno/Mali do. It's crazy that they dump all geometry to main memory rather than keeping it in the pipe. It explains why geometry is more of a limit on AGX/PVR than on Adreno/Mali.
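
To make the binning step concrete, here is a rough sketch of position-only binning. It is an illustration under stated assumptions, not how AGX, PowerVR, Adreno, or Mali implement it: the function name `bin_triangles`, the 32-pixel tile size, and the conservative bounding-box test are all hypothetical choices.

```python
import math


def bin_triangles(triangles, width, height, tile=32):
    """Map each tile (tx, ty) to the indices of triangles that may touch it.

    triangles is a list of three-vertex lists of screen-space (x, y)
    positions. Only positions are needed here, which is why a binning pass
    can run a position-only version of the vertex shader.
    """
    tiles_x = (width + tile - 1) // tile
    tiles_y = (height + tile - 1) // tile
    bins = {}
    for index, tri in enumerate(triangles):
        xs = [x for x, _ in tri]
        ys = [y for _, y in tri]
        # Conservative test: every tile overlapped by the bounding box is
        # marked, even if the triangle itself misses a corner tile.
        tx0 = max(math.floor(min(xs)) // tile, 0)
        tx1 = min(math.floor(max(xs)) // tile, tiles_x - 1)
        ty0 = max(math.floor(min(ys)) // tile, 0)
        ty1 = min(math.floor(max(ys)) // tile, tiles_y - 1)
        for ty in range(ty0, ty1 + 1):
            for tx in range(tx0, tx1 + 1):
                bins.setdefault((tx, ty), []).append(index)
    return bins
```

In this sketch the bins hold only triangle indices, so a later per-tile pass would have to recompute the remaining attributes. Writing the fully shaded geometry out to memory instead avoids that recomputation, at the cost of the memory traffic the comment points at.
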
That image gave me flashbacks of gnarly shader debugging I did once. IIRC, I was dividing by zero in some very rare branch of a fragment shader, and it caused those black tiles to flicker in and out of existence. Excruciatingly painful to debug on a GPU.
> The Tiled Vertex Buffer is the Parameter Buffer. PB is the PowerVR name, TVB is the public Apple name, and PB is still an internal Apple name.

Patent lawyers love this one silly trick.

Alyssa and the rest of the Asahi team are basically magicians as far as I can tell.

What amazing work, and great writing: it takes an absolute graphics layman (me) on a very technical journey, yet it is still largely understandable.

Really enjoyed the way it was written.
> Comparing a trace from our driver to a trace from Metal, looking for any relevant difference, we eventually stumble on the configuration required to make depth buffer flushes work.

> And with that, we get our bunny.

So what was the configuration that needed to change? Don't leave us hanging!!!

Impressive work and really interesting write up. Thanks!
Very interesting and easy to follow writeup, even for a graphics ignoramus like myself.
This is present in a lot of Unreal Engine games running on Mac OS X too. Tomb Raider is a great example.
What an entertaining story!
It's been said more than a few times in the past, but I cannot get over just how smart and motivated Alyssa Rosenzweig is - she's currently an undergraduate university student, and was leading the Panfrost project when she was still in high school! Every time I read something she wrote I'm astounded at how competent and eloquent she is.