Loosely related thought: A year ago, there was a lot of talk about the Mamba SSM architecture replacing transformers. Apparently that hasn't happened so far.
Quanta Magazine has an article that explains in plain words what the researchers were trying to do: https://www.quantamagazine.org/chatbot-software-begins-to-fa...
Those lemmas are wild.
Huh. I just skimmed this and quickly concluded that it's definitely not light reading.
It sure looks and smells like good work, so I've added it to my reading list.
Nowadays I feel like my reading list is growing faster than I can go through it.