I don't understand why people are still excited about this stuff
LLMs generate the answer one token at a time, and the second token isn't known until the first one has been generated. Knowing this is crucial for understanding why LLMs produce nonsensical answers when they try to explain the unexplainable.
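To make the one-token-at-a-time point concrete, here is a minimal sketch of such a generation loop. The `TOY_MODEL` table and the `generate` function are invented for illustration; a real LLM computes the next-token distribution with a neural network, not a lookup table.

```python
import random

# Hypothetical toy "model": maps the tokens generated so far to a
# probability distribution over the next token. The entries are made up.
TOY_MODEL = {
    (): {"Yes": 0.9, "No": 0.1},
    ("Yes",): {", it is prime.": 1.0},
    ("No",): {", it is divisible by 3.": 1.0},
}

def generate(rng, max_tokens=2):
    """Generate tokens one by one; each depends on all previous ones."""
    tokens = []
    for _ in range(max_tokens):
        dist = TOY_MODEL.get(tuple(tokens))
        if dist is None:
            break  # the toy table has no continuation for this prefix
        choices, weights = zip(*dist.items())
        # Sample the next token conditioned on everything generated so far.
        next_token = rng.choices(choices, weights=weights, k=1)[0]
        # Once appended, a token is never revised.
        tokens.append(next_token)
    return tokens

if __name__ == "__main__":
    print("".join(generate(random.Random())))
```

In this sketch the second token is looked up only after the first one has been chosen, so whatever comes first decides which continuations are even possible.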
The screenshot below illustrates the problem perfectly. The user asks whether a number is prime. The LLM generates the first word, which is "No" in this case. That's it. There's no way back. It will go on trying to generate a logical explanation of a wrong answer, which is impossible, so the explanation comes out as nonsense. The chance of generating a wrong first token is never zero, so situations like this are inevitable.
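The "never zero" part can be sketched with a few lines of arithmetic. This assumes plain sampling from a softmax over finite scores, with no truncation such as top-k, and the logit values are made up. Under those assumptions every token gets a strictly positive probability, and a small per-answer error chance adds up over many questions.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities; finite scores give values > 0."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical first-token scores: "Yes" (correct) is strongly favored.
logits = {"Yes": 8.0, "No": 0.0}
probs = dict(zip(logits, softmax(list(logits.values()))))
p_wrong = probs["No"]  # small, but strictly greater than zero

def p_at_least_one_wrong(p, n):
    """Chance that at least one of n independent answers starts wrong."""
    return 1 - (1 - p) ** n

if __name__ == "__main__":
    print(p_wrong)
    print(p_at_least_one_wrong(p_wrong, 10_000))
```

Treating the answers as independent is a simplification, but it shows why a rare wrong first token still turns up regularly once enough people ask.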
And there's nothing to laugh at if you understand how LLMs work.