I guess the comparison here is mainly for fun and food for thought, but I think the answer is that it depends on your use case. If there's a system that you are enthusiastic about and want to be able to develop code for, then that is probably what you should pick.
For some reason (not sure why, maybe it was the discussion of portability and this fun NVIDIA not-quite-assembly language), this made me wonder: has anybody gotten really good at writing LLVM IR? It seems fairly low level, but also quite portable. And… I don’t know, I’m talking about a topic I don’t know much about, so I’m happy to be corrected here, but as a static-single-assignment language maybe it is… even more machine sympathetic than assembly? (I’m under the impression that writing really high performance assembly is really quite difficult, you have to keep a ton of instructions in flight at once, right?)
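For anyone unsure what "static single assignment" means in practice, here is a toy sketch in Python. It is only an illustration under simplifying assumptions: it handles straight-line code with no branches (so no phi nodes), the `to_ssa` helper and the `name.N` versioning scheme are made up for this example, and it is not how LLVM itself constructs its IR.

```python
# Toy illustration of static single assignment (SSA) renaming for
# straight-line code. Hypothetical helper, not LLVM's algorithm:
# there are no branches here, so no phi nodes are needed.

def to_ssa(instructions):
    """Rename (dest, op, args) triples so each name is assigned exactly once."""
    version = {}  # variable name -> latest version number assigned so far
    out = []
    for dest, op, args in instructions:
        # Operands refer to the most recent version of each variable.
        # Names never assigned in this snippet (inputs) pass through unchanged.
        new_args = [f"{a}.{version[a]}" if a in version else a for a in args]
        # Each assignment creates a fresh version of the destination.
        version[dest] = version.get(dest, -1) + 1
        out.append((f"{dest}.{version[dest]}", op, new_args))
    return out


# x is assigned twice in the input; after renaming, every name is defined once.
program = [
    ("x", "add", ["a", "b"]),
    ("x", "mul", ["x", "c"]),
    ("y", "sub", ["x", "a"]),
]

ssa = to_ssa(program)
for dest, op, args in ssa:
    print(f"{dest} = {op} {', '.join(args)}")
```

The point is that the second write to `x` becomes a new name rather than overwriting the old one, so every use points at exactly one definition.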
There are also some microcontroller architectures worth considering:
* Intel 8051, dating from 1980. You can still buy e.g. the CH559, based on the same architecture with a USB interface retrofitted somehow.
* AVR 8-bit architecture. Readily available in (older) Arduinos or in easy-to-handle chip packages.
For modern architectures, reading skills can be very valuable, as they allow you to chase bugs beyond the high level language border. Astonishes your friends & confounds your enemies. Writing is decidedly more niche.
Is GPU assembly actually an in-demand skill?
Learning some x86 assembly early was a good thing because it taught me a bit about how computers work.
I'm still learning assembly for fun.
Many years ago: x86 for reverse engineering. Nowadays: bootable games (https://github.com/nanochess is an excellent treasure trove) and classic game consoles (GBA, SNES) etc.
The first commenter on the article page states that his favorite is PDP-11 assembly. In the 90s at uni I learned to write assembly on a PDP-11 emulator running on a PC. It truly was a nice experience.
The question of which assembly is best to learn is of course incredibly subjective, but I think the author gives short shrift to ARM32. It is historically important (especially for the Acorn computers, most popular in the UK), sensibly designed, and still relevant today, just in the context of microcontrollers.
Some of the most fun I've had programming assembly has been writing HDMI video scanout kernels for an RP2040 chip[1]. It was a delightful puzzle to make every single cycle count. There is a great sense of satisfaction in using every one of the 8 "low" registers (the other 8 "high" registers generally take one more cycle to move into a low register, but there are exceptions such as add and compare where they can be free; thus you almost always use a high register for the loop termination comparison). Most satisfying, you can cycle-count and predict the performance very accurately, which is not at all true on modern 64-bit processors. These video kernels could not be written in Rust or C with anywhere near the same performance. Also, in general, Rust compiles to pretty verbose code, which matters a lot when you have limited memory.
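To give a feel for what cycle counting looks like on a small in-order core, here is a rough Python sketch. Everything in it is an assumption for illustration: the per-instruction costs are simplified Cortex-M0+-style figures (zero-wait-state memory, no bus contention), and the loop body and iteration count are hypothetical, not taken from the actual scanout kernels.

```python
# Back-of-the-envelope cycle counting for a simple loop.
# ASSUMPTION: simplified Cortex-M0+-style costs with zero-wait-state
# memory and no bus contention; check the core's reference manual
# before relying on these numbers.
CYCLES = {
    "alu": 1,               # add, sub, shift, cmp, mov between low registers
    "load": 2,              # ldr from RAM
    "store": 2,             # str to RAM
    "branch_taken": 2,      # conditional branch that jumps back
    "branch_not_taken": 1,  # conditional branch that falls through
}


def loop_cycles(body, iterations):
    """Total cycles for a loop whose body ends in a conditional backward branch.

    `body` lists the instruction kinds in one iteration, excluding the
    final branch. `iterations` must be at least 1.
    """
    per_iter = sum(CYCLES[kind] for kind in body)
    # Every iteration but the last takes the branch; the last falls through.
    taken = (iterations - 1) * (per_iter + CYCLES["branch_taken"])
    last = per_iter + CYCLES["branch_not_taken"]
    return taken + last


# Hypothetical inner loop: load a word, two ALU ops, store the result,
# then compare the pointer against the end address.
body = ["load", "alu", "alu", "store", "alu"]
total = loop_cycles(body, iterations=160)
print(f"{total} cycles for 160 iterations")
```

On a core like this the sum is close to what the hardware will actually do, which is exactly what makes the puzzle tractable; on an out-of-order 64-bit core the same arithmetic tells you very little.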
Ironically, the reasons for this project being on hold also point to the downside of assembler: since then, the RP2350 chip has come out, and huge parts of the project would need to be rewritten (though it would be much, much more capable than the first version).
[1]: https://github.com/DusterTheFirst/pico-dvi-rs/blob/main/src/...