I think that humor and eros are two of the hardest-to-explain phenomena in neural nets trained on next-token prediction.
Now, *of course* they're explainable. But that wasn't common knowledge when this all started. Even back in the GPT-3 days, people were laughing their asses off at its funny outputs. And since then...
I mean, look at Opus 4: how they don't just mirror the joke, they evolve it, they add revolutions to the joke's engine.
It's that point where all the explanations and the mechanisms are *true*, but the underlying statistical mechanics give rise to such complexity that one necessarily abandons the explanatory layer and resorts to compressed intuition, because that's what compression is good for. The complexity behind that ability of a neural network exceeds what we can rationally work through in adequate time, and... to put it without importing 3 papers: the feels™ take over.